Sample size calculation: Cross-sectional studies

Let us consider the estimation of sample size for a cross-sectional study.

In order to estimate the required sample size, we need to know the following:

p: The prevalence of the condition/health state. If the prevalence is 32%, it may be used either as such (32%), or in its decimal form (0.32).

q: The complement of p, that is:

i. When p is in percentage terms: (100-p)

ii. When p is in decimal terms: (1-p)

d (or l): The precision of the estimate, expressed in the same terms as p (percentage or decimal). This could be either the relative precision or the absolute precision, as discussed later in this post.

Za [Z alpha]: The value of z from the probability tables. If the values are normally distributed, then 95% of the values will fall within approximately 2 standard errors of the mean. The exact value of z corresponding to this is 1.96 (from the standard normal variate tables).
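As a rough illustration of where 1.96 comes from, here is a minimal sketch that looks up the z value with Python's standard library instead of printed tables. The helper name `z_for_confidence` is hypothetical (it is not part of the original post), and it assumes a two-sided confidence level.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float = 0.95) -> float:
    """Two-sided z value for a confidence level (hypothetical helper)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    # Split alpha equally between the two tails of the standard normal curve.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        print(f"{level:.0%} confidence -> z = {z_for_confidence(level):.2f}")
```

For 95% confidence this gives a value that rounds to 1.96, which is the figure used in the rest of this post.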

The formula for estimating sample size is given as:

N= (Za)^2[p*q]/ d^2

where the symbol ^ means ‘to the power of’ and * means ‘multiplied by’; that is, “Z-alpha squared into pq; upon d-square”

Substituting the value of Za, we get:

N= (1.96)^2[p*q]/ d^2

We can round off the value of Za (1.96) to 2, to obtain:

N= (2)^2[p*q]/ d^2

or, N= 4pq/ d^2 that is, “4 pq by d-square”
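The formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_size` is hypothetical, p and d are both taken in percentage terms, and the result is rounded up to the next whole subject (a common convention that the post itself does not state). The value d = 5 in the usage lines is assumed purely for demonstration.

```python
import math


def sample_size(p: float, d: float, z: float = 1.96) -> int:
    """N = z^2 * p * q / d^2, with p and d both in percentage terms."""
    if not 0 < p < 100:
        raise ValueError("p must be a percentage strictly between 0 and 100")
    if d <= 0:
        raise ValueError("d must be positive")
    q = 100 - p
    n = (z ** 2) * p * q / (d ** 2)
    # round() removes floating-point noise before rounding up (assumed convention)
    return math.ceil(round(n, 6))


if __name__ == "__main__":
    p, d = 32, 5  # p = 32 as in the definition of p; d = 5 is an assumed value
    print("Using z = 1.96:", sample_size(p, d))
    print("Using z = 2   :", sample_size(p, d, z=2))
```

Rounding z up to 2 gives a slightly larger sample size than z = 1.96, so the shortcut 4pq/ d^2 errs on the safe side.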

Example:

I wish to conduct a cross-sectional study on awareness of Hepatitis B among school children. A literature search reveals that other investigators have reported knowledge to range from 5% to 20% among students of grades 6 through 8. What should the size of my sample be?

The formula requires us to input the value of d (precision). If the absolute precision is known, there is no problem. However, often we can only input a relative precision. Where do we get the value of relative precision from?

Typically, relative precision is taken as a proportion of ‘p’. The maximum permissible limit is 20% of ‘p’.

In the above example, if ‘p’ is 20%, then ‘d’ will be (20/100)*20= 0.2*20= 4 {Taking a relative precision of 20%}.

This means that the estimate will lie within 4 percentage points of ‘p’ on either side {+/- 4%: 16% to 24%}.

That is, by taking a relative precision of 20% of ‘p’, the study will be able to estimate the true awareness level to within 4 percentage points if the actual prevalence is about 20%. If the actual prevalence is much lower than this, however, the same sample size will not estimate it with comparable relative precision.

Therefore, the larger the value of ‘p’ (prevalence), the larger the value of ‘d’ (precision), keeping the relative precision fixed (say, at 20% of ‘p’). If the prevalence is 50%, ‘d’ (20% of ‘p’) would then be 0.2*50= 10 (as compared to ‘d’ = 4 when ‘p’ = 20%).

The reverse is also true: the smaller the value of ‘p’, the smaller the value of ‘d’. A smaller ‘d’ implies a larger sample size. Therefore, the choice of ‘p’ is crucial.

We can now input the values in the formula to obtain the sample size:

For the calculation we will take ‘d’ as 4. This yields:

N= (4*20*80)/ (4*4)

= 400. This sample size will enable us to estimate the prevalence to within 4 percentage points (16% to 24%) if the true prevalence is about 20%.

If we took ‘p’= 5, then the sample size would be:

N= (4*5*95)/(1*1) [‘d’= 0.2*5= 1]

= 1900. This sample size will enable us to estimate the prevalence to within 1 percentage point (4% to 6%) if the true prevalence is about 5%.
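The two calculations above can be checked with a short sketch. It assumes the rounded value z = 2 and a relative precision of 20% of ‘p’, as in the worked example; the function name `sample_size_relative` is hypothetical.

```python
import math


def sample_size_relative(p: float, relative: float = 0.20, z: float = 2.0) -> int:
    """Sample size when 'd' is given as a fraction of 'p' (p in percentage terms)."""
    d = relative * p  # absolute precision, e.g. 0.2 * 20 = 4
    q = 100 - p
    n = (z ** 2) * p * q / (d ** 2)
    # round() removes floating-point noise before rounding up
    return math.ceil(round(n, 6))


if __name__ == "__main__":
    for p in (20, 5):
        print(f"p = {p}%: d = {0.20 * p:g}, N = {sample_size_relative(p)}")
```

With the relative precision held fixed at a fraction r of ‘p’, the formula simplifies to N = z^2 * q / (r^2 * p), which makes it plain why a smaller ‘p’ demands a larger sample.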

So should I take ‘p’= 20% or ‘p’=5%?

That depends upon:

1. The location of your study: if you are planning to conduct the study in an urban area, use the prevalence reported by studies conducted in urban areas, and vice versa.

2. The available resources (time, manpower, money, etc.). Aim for the largest feasible sample size. The size should be adequate to yield 80% power. Do not unnecessarily increase the sample size unless the intention is to obtain greater power. If so, please mention the same in the methodology section.

3. The results of your pilot study. If you have conducted a pilot study, the prevalence obtained from that study should be taken as ‘p’. This will be much more accurate than any other external value.

Note 1: If you have multiple objectives, you must calculate the required sample size for each objective, then choose the largest sample size thus obtained. This will ensure adequate power for all objectives, else the study will lack power for one or more objectives. That is, you may not be able to detect a significant result where it actually exists because you failed to include enough subjects to detect it.

Note 2: It is advisable to mention a range rather than a single value for sample size. This is standard practice in the west, but not in India. A range may be obtained by calculating the sample size for different values of ‘p’.
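The two notes can be illustrated with a small sketch. It assumes z = 2 and a relative precision of 20% of ‘p’; the intermediate prevalences (10% and 15%) and the objective labels are made-up values for illustration, and the function names are hypothetical.

```python
import math


def sample_size(p: float, relative: float = 0.20, z: float = 2.0) -> int:
    """N = z^2 * p * q / d^2 with d = relative * p (p in percentage terms)."""
    d = relative * p
    n = (z ** 2) * p * (100 - p) / (d ** 2)
    return math.ceil(round(n, 6))


def sample_size_range(prevalences):
    """Smallest and largest sample size over a list of candidate prevalences."""
    sizes = [sample_size(p) for p in prevalences]
    return min(sizes), max(sizes)


if __name__ == "__main__":
    # 5 and 20 are the reported extremes; 10 and 15 are assumed in-between values
    low, high = sample_size_range([5, 10, 15, 20])
    print(f"Sample size range: {low} to {high}")

    # Note 1: with several objectives, keep the largest requirement
    per_objective = {"objective A (assumed)": 400, "objective B (assumed)": 1900}
    print("Required sample size:", max(per_objective.values()))
```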

282 thoughts on “Sample size calculation: Cross-sectional studies”

Sekartaji February 20, 2017 at 8:41 AM

Dear Dr Roopesh,

I would like to conduct a cross-sectional study, but I am having difficulty finding a formula to calculate my sample size because the population is quite large (about 211,857). I am going to survey the knowledge, health beliefs and intentions of female adolescents towards HPV vaccination, and no previous study has been done on this topic in my country. Could you please give me some advice on this matter?

Your help is greatly appreciated.

Sincerely,
Sekartaji

1. drroopesh (Post author) February 20, 2017 at 1:06 PM
  
  Dear Sekartaji,
  
  If I understand the question correctly, you want to know how to compute sample size from a population of 211,857 individuals.
  
  Please use the prevalence from the following (and similar) articles to estimate the required sample size using the formula for cross-sectional studies:
  https://www.ncbi.nlm.nih.gov/pubmed/24188759
  
  In order to obtain your sample, you might consider cluster or multi-stage sampling.
  
  Hope this helps.
  
  Regards,
  Dr. Roopesh
  
David Lazarus March 2, 2017 at 1:33 PM

What are the possible reasons for increasing sample size for cross-sectional studies?

1. drroopesh (Post author) March 3, 2017 at 7:04 AM
  
  Dear David,
  
  It is not ethical or practical to unnecessarily inflate the sample size for any study.
  
  The commonest reason for wanting to do so would be to increase the power of the study to detect even minor differences of interest.
  
  Another reason could be the desire to capture as much variation in the population as possible. However, this could be achieved by adopting a good sampling method.
  
  Regards,
  Dr. Roopesh
  
  1. David Lazarus March 3, 2017 at 7:50 AM
    
    Thank you so much for the answer; it is much appreciated.
    
Achanya April 1, 2017 at 5:26 PM

How do I calculate the sample size for a study in which the cases will be matched with controls, given that a previous study reported a prevalence of 32%?

1. drroopesh (Post author) April 4, 2017 at 7:39 AM
  
  Dear Achanya,
  
  Do you intend to have 1:1 matching, or higher?
  
  I hope you realize that in a case control study one is comparing proportions of outcome between cases and controls.
  Therefore, for sample size calculation, you need to provide proportions for both cases and controls.
  
  Regards,
  Dr. Roopesh
  
Winfred Nelson April 11, 2017 at 3:58 PM

When calculating the sample size for three communities using Slovin’s formula, if you add the totals for the three (for example, 1474) and calculate, you get about half the size (315); you can then use the proportion formula to redistribute. However, if you calculate for each of the communities separately, with populations of 350, 774 and 350, you get a total of 624. Now, if I am using mixed methods, what number should I interview: 315 or 624?

Winfred Nelson April 11, 2017 at 4:16 PM

In fact, the design is exploratory sequential, so I will do a questionnaire survey, generalise the results, and based on that select my qualitative methods (FGDs, in-depth interviews, etc.). The three communities are made up of farmers who all practice rainfed farming, but farmers from 2 of the communities also practice dry season farming because they use small-scale dams during the dry season. Again, what are my justifications for interviewing 315 and not 624? Is it okay to do so, so that I do not incur unnecessary cost?

1. drroopesh (Post author) April 12, 2017 at 12:51 AM
  
  Here’s a link to another useful document- by Creswell himself:
  http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1047&context=dberspeakers
  
  You might also find the following useful: http://epubs.scu.edu.au/cgi/viewcontent.cgi?article=1069&context=comm_pubs
  
  Regards,
  Dr. Roopesh
  
  1. Winfred April 12, 2017 at 12:08 PM
    
    Thanks very much, Dr. Roopesh. I have downloaded the materials and will take a critical look at them. If there are any issues thereafter, I will get back to you. Have a good day.
    
Winfred Nelson April 11, 2017 at 4:17 PM

EXPLANATORY SEQUENTIAL rather by Creswell

1. drroopesh (Post author) April 12, 2017 at 12:38 AM
  
  Dear Winfred Nelson,
  
  Please go through the following document for clarity on Mixed Methods Research:
  
  Click to access Prof._Dr._Burke_Johnson_Mixed_Methods_PRIMER.pdf
  
  The sample size would be influenced by the choice of qualitative approach; as well as the relationship between quantitative and qualitative samples- identical, nested, parallel or multilevel.
  
  Hope this helps.
  Regards,
  Dr. Roopesh
  
Qusay April 11, 2017 at 5:34 PM

I would like to conduct a study which hasn’t been done in my country, so how can I estimate the sample size? My study is on the influence of body mass index on liver size.
Regards,

1. drroopesh (Post author) April 19, 2017 at 6:54 AM
  
  Dear Qusay,
  
  Even though the study hasn’t been conducted in your country, it is possible to estimate sample size.
  
  From literature, identify the findings reported by other investigators. They would likely have reported several measures- AP diameter/ Transverse diameter/ Volume, etc. Determine which measure is of importance to your study, and note the relationship between BMI and that specific measure.
  
  Identify a study that was conducted in a setting similar to your own (even if in another country, factors like setting (rural/ urban); economic status (developing/ developed); etc. could be similar).
  
  Then determine what proportion of subjects in that study have the relationship of interest. Use that to estimate sample size using the formula provided in the article above.
  
  Hope this helps.
  
  Regards,
  Dr. Roopesh
  
Anonymous April 18, 2017 at 6:08 PM

Hi
I am going to conduct a cross-sectional study on the prevalence of cancer in women around the age of menopause with an ovarian cyst, looking at a biochemical marker called Ca 125.
I am still unable to calculate the sample size.

1. drroopesh (Post author) April 19, 2017 at 6:46 AM
  
  Dear Someone,
  
  Please perform a detailed review of literature and determine what proportion of perimenopausal women with ovarian cysts have elevated Ca 125 levels.
  
  Use that proportion to estimate sample size by substituting in the formula provided in the article above.
  
  If you get a range, estimate sample size using the lowest proportion, and use that to conduct your study if feasible.
  
  Regards,
  Dr.Roopesh
  
Solomon May 30, 2017 at 11:14 AM

I am going to do a survey on the banking industry, but the population sizes of the banks differ from one another. How am I going to deal with that? Please help.

1. drroopesh (Post author) May 31, 2017 at 3:23 PM
  
  Dear Solomon,
  
  You could try using cluster sampling method to conduct your survey. Each Bank would constitute a cluster, and you could perform sampling proportionate to size.
  
  If restricted to branches of a single bank, clusters could be determined on the basis of zones or regions, with business handled (in money terms- $, ₹, etc.) determining the proportionate size of each cluster.
  
  Hope this helps.
  Regards,
  Dr. Roopesh
  
Boniface June 24, 2017 at 10:02 AM

Dear Dr Roopesh,

I am conducting a cross-sectional study on the prevalence of cardiomyopathy among diabetes patients. A similar study done in my country showed a prevalence of 40%. I used the above formula for cross-sectional studies with a relative precision of 20% (of 40%). My university research committee asked me why I chose relative precision instead of absolute precision. Initially, when I was writing my proposal, I tried absolute precision and it gave me a high sample size of 334. When I used a relative precision of 20% (of 40%), it gave me 144, which I preferred (due to the limited study budget). How do you think I should answer the above question? Could you help me specifically with reasons for using relative precision instead of absolute precision?

1. drroopesh (Post author) June 26, 2017 at 10:54 AM
  
  Dear Boniface,
  
  Please read the article on relative and absolute precision:
  https://communitymedicine4asses.wordpress.com/2014/12/30/relative-and-absolute-precision-in-sample-size-calculation/
  
  The article has links to useful sources. You may benefit from reading both.
  
  I hope this helps.
  
  Regards,
  Dr. Roopesh
  
Joshua C June 30, 2017 at 9:44 AM

I am conducting research on sleep disorders in children with enlarged adenoids and tonsils in a hospital in Nigeria. Kindly help me with the type of study design and sample size calculation, since I could not find a similar study or prevalence.

1. drroopesh (Post author) July 3, 2017 at 11:46 PM
  
  Dear Joshua,
  
  Please state your research question (in PICO format) and objective(s).
  
  Regards,
  Dr. Roopesh
  
Ramkumar July 2, 2017 at 4:54 PM

Hello Dr. Roopesh, I am planning to conduct a cross-sectional study of the clinico-pathological and demographic profile of TB cervical lymphadenopathy, without follow-up, for a minimum of 1 year. I don’t know how to calculate the sample size. There are previous studies, but they have differing sample sizes. Please help me to calculate a sample size of around 100.

1. drroopesh (Post author) July 3, 2017 at 11:39 PM
  
  Dear Ramkumar,
  
  Please state your objective(s), study population and outcome measure(s).
  
  Regards,
  Dr. Roopesh
  
  1. Anonymous July 10, 2017 at 2:10 PM
    
    The objectives are in terms of the demographic and clinico-pathological profile; the study population is OP patients and in-ward patients; the outcome measures are based on final reports.
    
  2. Anonymous July 10, 2017 at 2:17 PM
    
    THESIS PROTOCOL
    
    CLINICO-PATHOLOGICAL AND DEMOGRAPHIC PROFILE OF
    TUBERCULAR CERVICAL LYMPHADENOPATHY
    Thesis Protocol Submitted For
    DIPLOMATE OF NATIONAL BOARD
    (RESPIRATORY MEDICINE)
    
    AIMS AND OBJECTIVES
    
    PRIMARY OUTCOME
    
    • TO STUDY THE CLINICO-PATHOLOGICAL AND DEMOGRAPHIC
    PROFILE OF TUBERCULAR CERVICAL LYMPHADENOPATHY PATIENTS
    
    MATERIAL AND METHODS
    
    STUDY DESIGN
    
    The present study is proposed to be a cross-sectional study conducted at the NATIONAL INSTITUTE OF TB AND RESPIRATORY DISEASES, covering patients in both OPD and IPD. Patients enrolled between Aug 2017 and Dec 2018 will be part of the study.
    
    STUDY METHOD
    
    Patients attending the OPD and patients in the IPD will be asked for a detailed history, and a thorough clinical examination will be done. This will be followed by all routine investigations and special tests such as the Mantoux test, USG abdomen, and FNAC of the lymph node with direct smear of the sample, cytopathological examination, and culture for MTB, all done at NITRD. Finally, the reports will be analyzed as in the proforma.
    
    SAMPLE SIZE AND STUDY PERIOD
    
    The expected sample will be the patients seen between Aug 2017 and Dec 2018 who give consent for the study and who are eligible for it.
    
    CRITERIA FOR SELECTION OF PATIENTS
    
    Inclusion criteria:
    • All patients who agree to participate in the study.
    Exclusion criteria:
    • Patients who are not willing to participate in the study.
    • Patients with a primary diagnosis of other diseases (e.g. cancer, sarcoidosis, pyogenic infections, etc.).
    
    REVIEW OF LITERATURE
    DEMOGRAPHIC INCIDENCE
    Mm Rahman et al: Out of 60 patients, 40 were female and 20 were male, a female-to-male ratio of 2:1. The most vulnerable age group was the 2nd decade, with 23 patients (38.33%); the 2nd highest incidence was in the 3rd decade, with 30%.
    
    Hussain et al: Out of 50 patients, the male-to-female ratio was 2.1:1. The disease was most common during the 2nd and 3rd decades of life (52%), with a peak incidence in the 2nd decade (32%).
    
    Devendra et al Out of 118 cases was found to be more prevalent in females as 30 out of 54(55.55%). In this study, we found out that TBL are commoner in 13-30 age groups, 83.33% .
    
    Vasuda et al: Out of 227 patients, there were 113 (49.7%) females and 114 (50.3%) males. The maximum number of cases [167 (73.6%)] with cytomorphology suggestive of tubercular lymphadenitis were aged 11–30 years.
    
    Shaukat et al: Of a total of 110 cases, 42 (38.1%) were males and 68 (61.8%) were females. The majority of patients were in the age range of 10 to 30 years, and the next largest group belonged to the 4th decade.
    Rasool et al: Of a total of 46 cases, female gender was found in the majority, 28 (61.87%), while male gender was 18 (39.13%).
    Soumya et al: A total of 63 patients were enrolled in the study, of which 25 were males and 38 females. The most commonly affected group in the study was the 15–24 years age group, comprising 57.1% (36 cases).
    
    Mohammed Ali et al: Of 115 cases, there were 71 males and 44 females, a male-to-female ratio of 1.61:1. The majority of patients affected were in the age group of 13 to 20 years (39.13%), followed by 21 to 30 years (28.70%). The least affected age group was 61 to 70 years (1.74%).
    
    Chaitali et al: Data of 80 patients were analyzed in this study. Gender-wise, 57 (71.3%) were females and the remaining 23 (28.7%) were males.
    
    Naresh et al Males 48% and females 52%. In 50 cases the disease commonly affected the affected were 2nd decade 18% and 3rd decade 8% respectively. Commonest age group affected is between 11and 20> 21, and 30 closely followed by 31 and 40 years .
    
    CLINICAL PRESENTATION
    
    Karthi et al, Majority did not have symptoms 16 cases (31.4%) out of 51 showed symptoms fever was the most common , seen in 31% of cases, followed by malaise in 18% . It was observed 8 cases (15.6%) out of 51 cases had a positive history contact with tb . It was observed that posterior triangle was the commonest to get involved (31.3%) followed by upper deep jugular (21.5%). Levels 1, 3 and 4 were equally involved.And the majority of nodes (78.4%) were 4 cm. It was seen in 41 cases out of total 51 cases (80.3%) had U/L involvement. The remaining (19.7%) had bilateral involvement. and multiple node involvement in 39 cases (76.5%) while 12 cases (23.5%) showed single. Matting was observed in 14 of the 51 cases (27.4%). discrete lymph nodes which was present in 37 of the 51 cases (29.7%).
    
    Mohankumar et al 18 cases (27.69%) out of 65 cases of tubercular showed presence of symptoms. It was observed that only 4 cases (6.15%) out of 65 cases had a positive history.It was observed that the majority of nodes affected in tuberculosis (80%) were less than 4 cm in size it was observed that Upper jugular group (level-2) was the commonest to get involved in tuberculosis (30.76%) .2-5 Among the cases only 15.39% cases presented with bilateralnode
    
    mmrahman et al Out of 60 patients BCG vaccination had a significant protective role; 19(31.67%) were vaccinated and 41 (68.33%) wereTuberculin test was positive in 44(73.34%) and negative in 2 (3.33%) and doubtful in 14 (23.33%).The common presentations were neck swelling 60 (100%), fever 40 (66.67%) and night sweat in 30(50%), wt loss 21(35%).
    
    Devendra et al In this study 1-2 cm size group were found to be having equal chances of tubercular and non-specific reactive lymphadenitis but 78.94% lymph nodes with size >2 cm were positive for tubercular lymphadenitis .Fever> anorexia>malaise>night sweats & weight loss was commoner symptoms in TBL
    
    Vasuda et al The study having 227 tb cervical lymphadenopathy pts
    The majority of the patients were otherwise healthy adults, and constitutional symptoms were present in 13% only. All the groups of cervical lymph node were involved including right and left cervical, posterior triangle, submental, submandibular, and supraclavicular regions.
    
    Zyedzulfiquer et al Study having 242 cases of tb cervical lymphadenopathy
    Most common constitutional symptoms are fever as wt loss(75%), night sweats(72%), LOA(45%).Most of the patients don’t have active contact only 28% had contact and 28% had past h/o tb treatment duration of lymphadenopathy in most of cases was less than 3 months.The size of Lymph Node was more than 1 cm and less than 2 cms in 70% of the patients. Gross appearance of Lymphadenopathy was multiple mattered in 65% of the patients with no tenderness in 78%
    
    Salman et al study population is 50 patients.Symptoms vary from 6 months to 2 yrs but m/c 7 wks to 3months 39 patients didn’t have any constitutional symptoms and remaining m/c had fever>malaise> LOA. H/O tb contact history was present in 19 patients. Examination showed b/l seen in 60% and location m/c post triangle(70%) f/b upper deep cervical(24%) and most of the lymphnode size was <1.15cm.
    
    Shaukat et al study population was 80 patients. In our study fever and weight loss are common complaint 52.7% and 63.6% respectively And b/l more common than unilateral and anterior group of nodes are more common than post group of nodes
    Rasool et al Multiple lymphadenitis was found in majority of the cases 26(56.53%), while 20(43.47%) cases were found with presentation.We found lymph node less than 3 CM found in 31(67.39%) cases and more on of single lymphadenitis than 3 CM were in15 (32.61%) cases. Fever was commonest clinical feature in 76% cases, following by swelling, abscess, solid nodes, weight loss, loss of appetite and others were noted with percentage of 55.69%, 39.13%, 45.65%, 58.69% and 21.73% respectively
    
    CYTO PATHOLOGICAL, CULTURE AND DIRECT SMEAR EXAMINATION
    Karthikeyan et al Out of the 51 histopathologically confirmed cases of tuberculous cervical lymphadenitis, a diagnosis of tuberculosis was made in 43 cases by FNAC. The other 7 cases were diagnosed as chronic non-specific lymphadenitis. There were no false positive cases on FNAC. 44 cases were true negative for tuberculosis. The sensitivity and specificity of FNAC for diagnosing tuberculous lymphadenitis is therefore 86% and 100% respectively .
    
    Mohan kumar et al: In the present study, the sensitivity of FNAC for tuberculosis was only 86.20% and the specificity was 100%.
    
    Mm rahman et al: In this study, among 60 patients, 44 (73.34%) were tuberculin positive (more than 10 mm induration), 14 (23.33%) were doubtful (between 1-10 mm) and 2 (3.33%) were negative (no induration seen). Among the 60 patients with tuberculous cervical lymphadenitis, 51 (85%) had caseation.
    
    Vasuda et al: In this study, the cytomorphological features observed were caseating epithelioid granulomas [47.6% (108/227)], granulomatous lymphadenitis [33.9% (77/227)], necrotizing lymphadenitis [1.8% (4/227)], and necrotizing suppurative lymphadenitis [16.7% (38/227)]. ZN staining for AFB was done in all the cases. Smear positivity for Mycobacterium sp. by the conventional ZN method was 19.4% (44/227). AFB positivity was the maximum (44.7%) in necrotizing suppurative lymphadenitis.
    The appearance of aspirates found most commonly was blood-mixed in 68.3% of cases, followed by whitish cheesy material in 21.1%, pus-like in 6.2%, and yellowish in 4.4%. AFB positivity was the maximum (42.8%) in pus-like aspirate.
    
    Salman et al: The study had a population of 50 cases, of which 41 (82%) were confirmed by FNAC. AFB were seen on direct smear examination in 12 cases, and 9 (18%) needed excisional biopsy to confirm the diagnosis.
    
    Soumyajit et al: FNAC was diagnostic in 42 cases (73.7%), where epithelioid granuloma and Langhans cells with or without necrosis were seen. The aspirate from affected lymph nodes did not reveal AFB in most of the cases. Only 23 samples (40.4%) revealed AFB after ZN staining. FNAC was non-specific in 15 samples, which further required incision/excision biopsy for diagnosis.
    
    PROFORMA
    
    CASE NO: OPD REG NO:
    NAME: FATHER/HUSBAND NAME:
    AGE: SEX:
    OCCUPATION: MARITAL STATUS:
    AREA:
    
    PRESENTING COMPLAINT: DURATION
    LYMPHNODE ENLARGEMENT:
    FEVER:
    COUGH:
    WEIGHT LOSS:
    LOSS OF APPETITE:
    CHEST PAIN:
    OTHERS POSITIVE HISTORY:
    
    PAST HISTORY:
    TUBERCULOSIS:
    HYPERTENSION:
    DIABETES:
    HIV:
    SURGICAL INTERVENTION:
    BLOOD TRANSFUSION:
    OTHER PAST SIGNIFICANT HISTORY:
    
    PERSONAL HISTORY:
    H/O SMOKING:
    H/O ALCOHOL:
    H/O DRUG ABUSE:
    BLADDER AND BOWEL COMPLAINT:
    H/O CONTACT WITH TB:
    NO OF CHILDREN:
    
    TREATMENT HISTORY:
    H/O ATT:
    ANY OTHER MEDICATION:
    
    GENERAL EXAMINATION:
    TEMPERATURE:
    B.P: PULSE: RESPIRATORY RATE:
    PALLOR: ICTERUS: CLUBBING: CYANOSIS: PEDAL EDEMA:
    BCG SCAR:
    LYMPHNODE :
    
    SYSTEMIC EXAMINATION
    CVS:
    
    RS:
    
    P/A:
    
    CNS:
    
    INVESTIGATIONS REPORTS;
    HB: TLC: DLC: ESR:
    Blood sugar(random): UREA: CREATININE:
    S.BILIRUBIN:Total- Direct- SGOT/SGPT/ALP:
    S.PROTEIN:Total- Albumin-
    URINE:Albumin- sugar- microscopy
    Sputum for AFB(D/S):
    X-ray CHEST:
    USG abdomen:
    FNAC report:
    AFB by D/S:
    CULTURE report:
    
    1. drroopesh (Post author) July 15, 2017 at 9:41 AM
      
      I am not sure I understand what exactly you intend to do.
      
      You will recruit patients with tuberculous cervical lymphadenopathy, and obtain some information- this much is clear.
      
      What is not clear is what question you are trying to answer by collecting that information. That is why I requested you to provide your research question in PICO format.
      
      Please note that unless you provide an answerable research question, I will be unable to provide additional assistance.
      
      Regards,
      Dr. Roopesh
      
adaze woghiren July 6, 2017 at 1:18 PM

Hello, please, I am trying to correlate two variables in estimating the severity of chronic liver disease. How do I go about calculating my sample size, since it is a cross-sectional study I am conducting? Thanks.

1. drroopesh (Post author) July 9, 2017 at 9:07 AM
  
  Dear Adaze,
  
  Please use the formula provided in the article above: 4pq/ l^2.
  
  If you provide details of your objectives and outcome variables, I might be able to provide specific guidance.
  
  Please note that I will be very busy this week, so might not be able to respond before the weekend.
  
  Regards,
  Dr. Roopesh
  
sara July 8, 2017 at 4:15 PM

hello dr.
My study is to identify the number of stem cells in a diabetic patient group and a non-diabetic group, then compare between the two groups. So is it a comparative cross-sectional design or case control? And how can I estimate the sample size?

1. drroopesh (Post author) July 9, 2017 at 9:24 AM
  
  Dear Sara,
  
  What is your research question? The study design is determined by the research question.
  
  Please formulate your research question using the PICO criteria and revert to me.
  
  Please note that I will be very busy over the coming week, hence might be unable to respond before the weekend.
  
  Regards,
  Dr. Roopesh
  
  1. sara July 10, 2017 at 4:31 AM
    
    thanks dr. for replying…
    my research question is:
    in mild gestational diabetic women, is the number and quality of the haematopoietic stem cells of umbilical cord blood affected compared to non-gestational diabetic women?
    
bonifacelumori July 13, 2017 at 8:32 PM

Dear Roopesh,

I am still confused about sample size calculation. My study is on prevalence and factors associated with cardiomyopathy among diabetic patients. I wanted to use a prevalence of 67.8% (from a similar study done in my country). Please show me what your sample size would be, so that I can compare it with what I got (which I think is not correct). Use absolute precision and a 95% confidence interval.

With regards,
Boniface

1. drroopesh (Post author) July 15, 2017 at 9:22 AM
  
  Dear Boniface,
  
  Please read page 59 of the following document:
  
  Click to access 03_MainPaper_Sampling.pdf
  
  I believe this will help resolve your doubts.
  
  Regards,
  Dr. Roopesh
  
Kunle September 8, 2017 at 11:52 AM

Hi, kindly clarify which formula I need to use to calculate the sample size for my study “Cryptosporidium parvum among HIV positive and Seronegative subjects attending National Hospital, Ilado”.
The main objective is to compare the prevalence of C. parvum in these groups of people. The study design is comparative cross-sectional.

1. drroopesh (Post author) September 9, 2017 at 12:05 AM
  
  Dear Kunle,
  
  The formula would be the same as mentioned in the article: 4pq/l^2.
  
  Regards,
  Dr. Roopesh
  
Anonymous September 20, 2017 at 1:36 PM

Dear dr Roopesh,
I am Fawad Qazi, starting research on “cardiopulmonary fitness in DOW medical university” using the 1 mile walk test (Rockport test). I need your help to calculate the sample size.

1. drroopesh (Post author) September 20, 2017 at 4:55 PM
  
  Dear Fawad,
  
  You will have to state your research question in PICO format, objective(s) and outcome measure(s).
  
  Please go through previous comments in this thread, and related articles on this blog as well.
  
  Regards,
  Dr. Roopesh
  
annonymous September 28, 2017 at 10:44 AM

Dear Dr Roopesh, I am conducting a comparative study of sexual abuse among adolescents in and out of school. I want a sample size of 520 for each group. What prevalence can I use to arrive at that using the sample size formula for comparing proportions? Please help. Thanks

1. drroopesh (Post author) September 29, 2017 at 7:50 AM
  
  Dear Anonymous,
  
  Please note that one does not decide the sample size in advance and then reverse engineer to determine the prevalence.
  
  What you need to do is determine the prevalence from literature, then use the prevalence values thus obtained to estimate sample size. This needs to be done for each objective. Finally, select the largest sample size estimate obtained as your required sample size.
  
  Regards,
  
  Dr. Roopesh
  
Ann October 26, 2017 at 7:38 AM

Dear Dr Roopesh,
I would like to conduct a comparative cross-sectional study comparing the mean of an analyte in 4 different cohorts of patients. How do I calculate the sample size?
Thanks.
Regards

1. drroopesh (Post author) October 28, 2017 at 12:55 AM
  
  Dear Ann,
  
  Please state your research question (in PICO format), and objective(s).
  
  Regards,
  
  Dr. Roopesh
  
  1. salman karim November 1, 2017 at 3:04 PM
    
    Dear Dr Roopesh, please can you guide me on how to calculate sample size for behavioral science studies (related to student psychology)?
    
    thank you salman karim
    
    1. drroopesh Post authorNovember 6, 2017 at 12:06 AM
      
      Dear Salman Karim,
      
      The calculation depends upon the type of study and variables under consideration.
      
      The simplest approach is to determine sample size based on the type of study, as described here.
      
      The procedure would remain the same:
      1. State your research question (PICO format)- determines the study design
      2. State your objectives- provides information about the outcome variable(s) under consideration.
      3. From literature, determine values for outcome variable(s)
      4. Substitute values in appropriate formula
      5. Obtain required sample size.
      
      I hope this helps.
      
      Regards,
      Dr. Roopesh
      
      1. Anonymous November 13, 2017 at 4:19 AM
        
        Thank you so much
        
Hasan November 6, 2017 at 7:02 AM

dear Dr. Roopesh
I am going to conduct research to identify the factors affecting patient satisfaction with rehabilitation service quality. I am using a cross-sectional study design with a questionnaire, but I am facing challenges in calculating my sample size. Could you give me any ideas, please?

1. drroopesh Post authorNovember 8, 2017 at 8:39 AM
  
  Dear Hasan,
  
  Please state your research question, objective(s) and outcome measure(s).
  
  Regards,
  Dr. Roopesh
  
priti sapkota December 20, 2017 at 10:37 AM

how to calculate sample size for cross sectional comparative study

1. drroopesh Post authorDecember 20, 2017 at 4:03 PM
  
  Dear Priti,
  
  All epidemiological studies include comparison(s). Therefore, the study design is ‘Cross-Sectional Study’, not ‘Cross-Sectional Comparative Study’.
  
  You may use the formula provided in the article to estimate sample size.
  
  Regards,
  Dr. Roopesh
  
priti sapkota December 20, 2017 at 5:11 PM

Calculation of sample size:
Based on the study by Johncy SS, Samuel TV, Jayalakshmi MK, Dhanyakumar G, Bondade SY (Prevalence of respiratory and non-respiratory symptoms in female sweepers), the sample size will be calculated for two proportions (cases and controls). The study reports the respiratory symptom cough in 13.3% of controls and 36.6% of cases.
Hence, prevalence in cases (P1) = 0.366
Prevalence in controls (P2) = 0.133
q1 = 1 - P1 = 1 - 0.366 = 0.634
q2 = 1 - P2 = 1 - 0.133 = 0.867
Zα/2 at 95% confidence = 1.96
Zβ at 80% power = 0.842
P̄ = (P1 + P2)/2 = (0.366 + 0.133)/2 = 0.2495
Q̄ = 1 - P̄ = 1 - 0.2495 = 0.7505
Sample size (n) = {Zα/2 * √(2*P̄*Q̄) + Zβ * √(P1*q1 + P2*q2)}^2 / (P1 - P2)^2

= {1.96 * √(2*0.2495*0.7505) + 0.842 * √(0.366*0.634 + 0.133*0.867)}^2 / (0.366 - 0.133)^2
= 53 in each group
Adding around 10% for non-response, a total of 118 participants will be enrolled: 59 sanitation workers and 59 in the comparison group.

Is this calculation of sample size correct for a cross-sectional study?
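
The two-proportion arithmetic above can be checked with a short script. This is only a minimal sketch in Python, assuming the pooled-variance formula quoted in this comment. The function name is illustrative, and the z-values are taken from the standard normal distribution instead of rounded table values.

```python
import math
from statistics import NormalDist


def two_proportion_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two proportions (illustrative helper)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    p_bar = (p1 + p2) / 2                          # average of the two proportions
    q_bar = 1 - p_bar
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * q_bar)
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    # Round up, since a fraction of a participant cannot be enrolled.
    return math.ceil(numerator / (p1 - p2) ** 2)


n_per_group = two_proportion_sample_size(0.366, 0.133)
print(n_per_group)
```

With P1 = 0.366 and P2 = 0.133 this gives 53 per group, in line with the figure quoted in the comment.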

1. drroopesh Post authorDecember 21, 2017 at 12:31 AM
  
  Dear Priti,
  
  The study design is Case Control Study.
  Please go through the following:
  
  Click to access SSCCDoc.pdf
  
  You may use the following online tools to calculate sample size:
  http://www.openepi.com/SampleSize/SSCC.htm
  
  http://sampsize.sourceforge.net/iface/s3.html
  
  Hope this helps.
  Regards,
  Dr. Roopesh
  
2. drroopesh Post authorDecember 21, 2017 at 12:33 AM
  
  Dear Priti,
  
  Please note that in cross-sectional studies as well as case control studies, there is no need to adjust for non-response, since there is no follow-up, and those who don’t wish to participate are simply excluded from the study.
  
  Regards,
  Dr. Roopesh
  
priti sapkota December 21, 2017 at 1:27 AM

thank you. could you please provide me an example calculation of sample size of cross-sectional study where comparison is used.

1. drroopesh Post authorDecember 21, 2017 at 8:00 AM
  
  Dear Priti,
  
  Like I said earlier, comparisons are integral to epidemiological studies. For the purpose of sample size estimation, one only needs details of the variables in the formula- described in the article.
  
  Having said that, one could estimate sample size based on the type of variables under study, and differences between them-
  difference between two means
  difference between two proportions, and so on.
  
  A good example of how that works is using the free tool G*Power:
  http://www.gpower.hhu.de/en.html
  
  Click to access 10.3758%2FBF03203630.pdf
  
  You could also read the following article:
  https://communitymedicine4asses.wordpress.com/2014/04/18/sample-size-calculation-two-ways-of-approaching-it/
  
  I hope this helps.
  
  Regards,
  Dr. Roopesh
  
priti sapkota December 21, 2017 at 10:56 AM

thank you very much

Mira December 30, 2017 at 11:12 AM

Dear Dr Roopesh,

I’m conducting a cross sectional study among a population of 162 workers, I have calculated my sample size using two proportions P1 and P2 formula, but the sample size obtained is 263 which is bigger than the population. May I know how can I correct the sample size so that it will be less than the population?

Thank you and regards,
Mira

Mira December 31, 2017 at 10:30 AM

Dear Dr Roopesh,

I have a population of 162 workers for my study, but the sample size calculated is bigger than my population which is 263. May I know how can I correct the sample size to be smaller than the population? Thank you

1. drroopesh Post authorDecember 31, 2017 at 11:27 AM
  
  Dear Mira,
  
  I suggest you use finite population correction.
  The following should help:
  
  Click to access section7_3.pdf
  
  https://onlinecourses.science.psu.edu/stat414/print/book/export/html/264
  http://www.statisticshowto.com/finite-population-correction-factor/
  
  Regards,
  Dr. Roopesh
  
2. michael December 9, 2019 at 10:33 AM
  
  You can use the reduction formula
  n/(1 + n/N)
  263/(1 + 263/162) ≈ 100.2
  n = your calculated sample size
  N = the total population you have
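
As a rough illustration of the reduction formula in this comment, the sketch below applies n/(1 + n/N) to the figures from Mira's question. The function name is made up for this example, and some texts use n/(1 + (n - 1)/N) instead, which gives a slightly different value.

```python
import math


def finite_population_correction(n, population):
    """Adjust an uncorrected sample size n for a finite population of size N."""
    return n / (1 + n / population)


adjusted = finite_population_correction(263, 162)
# Print the corrected value and the whole number of participants to enrol.
print(round(adjusted, 1), math.ceil(adjusted))
```

The corrected size always stays below the population size, which is the point of the adjustment.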
  
Hassan Benya February 9, 2018 at 6:47 PM

I would like to conduct a cross sectional study but somehow confused to calculate my sample size because the population is quite huge about 1,055,964. I am doing a hypothetical study project title “Is there a relationship between socio-economic status and the risk of acquiring hepatitis B infection in Freetown, Sierra Leone” . Kindly I need your help on this.

1. drroopesh Post authorFebruary 11, 2018 at 8:00 AM
  
  Dear Hassan,
  
  The calculation of sample size remains largely unchanged. What you need to determine is the sampling method. Perhaps, multi-stage sampling will be suitable in your case.
  
  Regards,
  Dr. Roopesh
  
hoodo March 27, 2018 at 4:47 PM

hello Dr
My study design is a cross-sectional study: collecting milk samples from various milk vendors, with interviews and observation of the milking process and the milk handling practices of milk vendors and milk producers.
Which sample size should I use?

1. drroopesh Post authorMarch 29, 2018 at 2:40 AM
  
  Dear Hoodo,
  
  Thanks for writing in.
  
  What is/are the objective(s) of your study?
  
  Regards,
  Dr. Roopesh
  
Ange June 20, 2018 at 3:26 AM

Hi Dr. Roopesh

I am doing a cross sectional study on determining bone mass in the lower limb of post-menopausal women and compare this with bone mass at other sites in these same women. How do I calculate the sample size for this project?

1. drroopesh Post authorJune 20, 2018 at 8:53 AM
  
  Dear Ange,
  
  You might want to consider cluster sampling. The following might be useful:
  
  http://www.statisticshowto.com/what-is-cluster-sampling/
  
  Please read the following for a more detailed discussion of cluster sampling. Sample size calculation is discussed from page 12 onwards.
  
  I hope this helps.
  
  Regards,
  Dr. Roopesh
  
lulu February 5, 2019 at 4:18 AM

I am doing a cross-sectional study about the level of fruits and vegetables among adolescents in day and boarding schools. How do I calculate the sample size if I don’t have the value of p?

1. drroopesh Post authorFebruary 6, 2019 at 3:47 PM
  
  Dear Lulu,
  
  In case you don’t have a value of prevalence from literature, you may estimate the same from observations (yours and others’). Preferably, you must guess a maximum possible and minimum possible value, then calculate sample size using both values. That will give you a range of sample sizes. Choose one that is most feasible.
  
  I hope this helps.
  
  Regards,
  Dr. Roopesh
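
The range of sample sizes described above could be sketched as follows. This is a hedged Python example that assumes the article's formula N = 4pq/l^2 with p in percent and l taken as 20% of p. The minimum and maximum prevalence values are hypothetical guesses, not figures from any study.

```python
import math


def sample_size_4pq(p_percent, relative_precision=0.20):
    """N = 4pq / l^2 with p and q in percent and l taken as a fraction of p."""
    q_percent = 100 - p_percent
    l = relative_precision * p_percent
    # round() guards against floating-point noise before rounding up.
    return math.ceil(round(4 * p_percent * q_percent / l ** 2, 6))


# Hypothetical guesses for the lowest and highest plausible prevalence (in %).
guesses = {"minimum": 10, "maximum": 40}
sizes = {label: sample_size_4pq(p) for label, p in guesses.items()}
print(sizes)
```

With relative precision, the lower prevalence guess produces the larger sample size, so the two results bracket the range to choose from.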
  
Krishna Subed February 26, 2019 at 12:50 PM

Could you give me a reference about when we can use relative precision as a % of the proportion? Thank you in advance.

1. drroopesh Post authorFebruary 27, 2019 at 12:15 PM
  
  Dear Krishna Subed,
  
  Please find the references at the end of the article on relative and absolute precision:
  
  https://communitymedicine4asses.com/2014/12/30/relative-and-absolute-precision-in-sample-size-calculation/
  
  Regards,
  Dr. Roopesh
  
Ernest Nwachukwu April 29, 2019 at 12:16 PM

Dear Dr. Roopesh,

I am carrying out a study to describe the mobility profile of community-dwelling older adults in a region with a population of about 146,647 older adults.

Please could you explain to me how to calculate the appropriate sample size for the study.

Kind regards.

1. drroopesh Post authorApril 30, 2019 at 12:32 AM
  
  Dear Ernest,
  
  Please provide me with your research question in PICO format- that determines the study design.
  
  Thanks!
  
  Dr. Roopesh
  
Ernest Nwachukwu May 2, 2019 at 1:50 PM

Dear Dr Roopesh,

Here is my research question in PICO format:
What are the mobility profiles (patterns) of community-dwelling older adults in the southeastern part of Nigeria.

P: community-dwelling older adults
I: Test with Short Physical Performance Battery and 6 minutes walk test. Then interview with Preclinical disability scale and Lower Extremity Functional Scale.
C: none
O: mobility profiles (i.e. no mobility limitation, preclinical mobility limitation, mild mobility limitation, moderate mobility limitation or severe mobility limitation) or performance in the test.

I hope this helps you in guiding me through the calculation of the appropriate sample size. Please remember the population size of the older adults in this region is about 146,647.

Kind regards!

Ernest

1. drroopesh Post authorMay 5, 2019 at 2:45 PM
  
  Dear Ernest,
  
  Please go through the article below for guidance on formulating a research question using the PICO criteria:
  https://communitymedicine4asses.com/2013/08/18/how-to-formulate-a-research-question-the-pico-criteria/
  
  Regards,
  Dr. Roopesh
  
  1. Ernest Nwachukwu May 5, 2019 at 10:35 PM
    
    Dear Dr Roopesh,
    
    Thank you so much for the link. I have gone through two of the articles and I have come up with the following research question in PICO format:
    
    “Among community-dwelling older adults, how prevalent is pre-clinical disability?”
    
    I hope I got it right this time.
    
    Please help me on how to calculate the appropriate sample size for this study taking the population of older adults in this region to be 146,647.
    
    Kind regards!
    
    1. drroopesh Post authorMay 6, 2019 at 12:39 PM
      
      Dear Ernest,
      
      Since your research question seeks to determine the prevalence of pre-clinical disability, the study design would be cross-sectional study.
      
      In order to estimate sample size, one would require the prevalence of pre-clinical disability in a similar population; or a rough estimate of prevalence from clinical experience.
      
      The formula 4pq/l^2 will yield the sample size for a cross sectional study, where
      p: prevalence of preclinical disability (in %)
      q: (100-p)
      l: relative precision (a proportion of p; up to a maximum of 20% of p).
      
      You could obtain values of p from various studies, and take the largest sample size that is practical for you.
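
As an illustration of the formula above, here is a minimal Python sketch. It assumes p is given in percent and l is 20% of p, and it uses the 5% and 20% prevalence figures from the article's Hepatitis B example. The function name and the optional z parameter are additions for this example: z = 2 reproduces 4pq/l^2, while z = 1.96 gives the unrounded version.

```python
import math


def cross_sectional_sample_size(p_percent, relative_precision=0.20, z=2.0):
    """N = z^2 * p * q / l^2, with p and q in percent and l a fraction of p."""
    q_percent = 100 - p_percent
    l = relative_precision * p_percent
    n = (z ** 2) * p_percent * q_percent / l ** 2
    # round() guards against floating-point noise before rounding up.
    return math.ceil(round(n, 6))


for p in (5, 20):
    rounded_z = cross_sectional_sample_size(p)        # 4pq / l^2
    exact_z = cross_sectional_sample_size(p, z=1.96)  # (1.96)^2 pq / l^2
    print(p, rounded_z, exact_z)
```

Rounding z up to 2 makes the estimate slightly more conservative than using 1.96.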
      
      Please also go through the following:
      https://communitymedicine4asses.com/2018/06/23/how-to-calculate-sample-size-with-epi-info-7/
      
      Regards,
      Dr. Roopesh
      
      1. Ernest Nwachukwu May 7, 2019 at 11:48 AM
        
        Dear Dr. Roopesh,
        
        This has been most helpful to me. I have already invited several of my friends carrying out research works to visit this site.
        
        Thanks a million times.
        
        Regards
        Ernest
        
        
        drroopesh Post authorMay 7, 2019 at 1:38 PM
        
        Dear Ernest,
        
        I am glad to have been of help to you. Do visit again!
        
        Regards,
        Dr. Roopesh
        
      2. Ernest Nwachukwu October 25, 2019 at 5:55 PM
        
        Dear Dr Roopesh,
        
        I am happy to visit this site again.
        
        I would like to know if there is a scholarly or widely accepted name for this formula for calculating sample size for cross-sectional studies: 4pq/l^2.
        
        Kind regards
        
        Ernest
        
        
        drroopesh Post authorNovember 9, 2019 at 10:14 AM
        
        Dear Ernest,
        
        The formula doesn’t have a particular name. Nevertheless, it will be found in any biostatistics/ research methodology text dealing with sample size estimation.
        
        Regards,
        Dr. Roopesh
        
Nilusha Gayan Mahakumbura May 18, 2019 at 4:00 AM

Dear Dr. Roopesh,

I’m carrying out a descriptive cross sectional study and i want to take a sample out of a finite population. What sample size calculating formulas are the best for that?

Thank you!

best regards!

1. drroopesh Post authorMay 22, 2019 at 4:22 PM
  
  Dear Nilusha,
  
  Please see my response to Mira (31 Dec 2017) above regarding finite population.
  
  Regards,
  Dr. Roopesh
  
Nneoma June 5, 2019 at 1:28 PM

Dear, Dr. Roopesh
I’m carrying out an experimental work on the effect of aerobic exercise on self esteem of overweight and obese youth in university and I need to get a good sample size calculation.
Thanks

1. drroopesh Post authorJune 12, 2019 at 3:18 AM
  
  Dear Nneoma,
  
  Apologies for the delay in responding.
  
  Please provide me your research question in PICO format, as well as objectives.
  
  Regards,
  Dr. Roopesh
  
Nneoma June 18, 2019 at 11:25 AM

Dear Roopesh
‘what effect does aerobic exercise have on self esteem and self perceived body image of overweight and obese undergraduate students ‘.
Sir this is my research topic in Pico format, it is an experimental study that involves two groups
Thank you
Nneoma

1. drroopesh Post authorJune 20, 2019 at 2:53 AM
  
  Dear Nneoma,
  
  What is the comparison group? Are you comparing between overweight and obese students, or are they a single group that you will compare with another group (normal)?
  
  For calculation of sample size of an experimental study, you need to specify the type of RCT- superiority/ equivalence/ non-inferiority; and provide an estimate of the effect size (how much of a difference do you expect between the two groups?).
  
  Please share the link to the main reference article you wish to use estimates from.
  
  Regards,
  Dr. Roopesh
  
Nneoma June 23, 2019 at 11:38 PM

Dear Dr. Roopesh
There are 2 groups, one undergoes exercise as intervention while the other is a control group. It is RCT that has both overweight and obese in each group. I’m really confused about the difference to expect from the two groups. As for the article there seems to be no similar research in my country.
Thanks
Nneoma

Nneoma June 24, 2019 at 9:01 AM

Dear Dr. Roopesh
There are 2 groups, one undergoes exercise as intervention while the other is a control group. It is RCT that has both overweight and obese in each group. I’m really confused about the difference to expect from the two groups. As for the article there seems to be no similar research in my country. It is non inferiority RCT.
Thanks
Nneoma

1. drroopesh Post authorJune 26, 2019 at 2:29 AM
  
  Dear Nneoma,
  
  Your outcome is self-esteem, which (I suspect) will be assessed by a tool that assigns scores. The difference in scores between those who undergo exercise (intervention) compared to those in the control arm is what I seek. You may obtain this information from existing studies (need not have been done in your setting, but must have similar study population (in terms of eligibility criteria)); or clinical observation (you may guess the difference based on observations from practice).
  
  I hope this helps.
  
  Regards,
  Dr. Roopesh
  
  1. Nneoma July 9, 2019 at 8:58 AM
    
    Dear Roopesh
    I still couldn’t make something out of it. please is there no other way to calculate non inferiority RCT, that involves only two groups. Please is there any calculator or formula for it. Kindly check.
    Meanwhile I found a similar study but there was no sample size calculation. participants were those who showed interest in the study, the study was done without any particular number of sample in mind.
    my regards,
    Nneoma.
    
    1. drroopesh Post authorJuly 10, 2019 at 4:00 PM
      
      Dear Nneoma,
      
      Unfortunately, there is no alternative.
      
      However, I’ll try to simplify things for you:
      
      You are planning a RCT which has two arms- one of which receives exercise as the intervention, while the other is a control arm. The purpose is to see if the intervention affects self-esteem or not.
      
      If the process of randomization is done properly, both arms should be similar with respect to known and unknown confounders. In simple terms, randomization will cause the overweight and obese individuals to be distributed evenly across both arms (this way, their influences will cancel out).
      
      Assuming you plan to use the Rosenberg self-esteem scale, you will possibly administer the tool before the start of intervention to determine baseline self-esteem scores. If randomization has been performed well, there shouldn’t be a significant difference between the two arms’ self-esteem scores.
      
      Next, you require the intervention arm to exercise for specified duration and intensity, while the control arm doesn’t. After some time you will stop the intervention. At this point, you will possibly measure the self-esteem scores once again.
      
      Unless self-esteem naturally declines with time, there should not be a significant difference between the two measurements of the control arm. However, there should be a difference between the intervention arm and control arm. It is the magnitude of this difference that is required for computation of sample size.
      
      Continuing with the Rosenberg self-esteem scale as our example, the scale has a maximum score of 30, with values less than 15 indicating low self-esteem. What you need to do is guess the scores before and after intervention between the two arms. You may find values reported by other researchers in general population. These could be taken as the baseline score in ordinary people. Ask yourself if the scores in your study population are likely to be higher or lower than those values. Take an educated guess and determine a value. Don’t worry too much about it being very accurate- it should be okay as long as you aren’t completely off the mark. This value is your baseline score (estimated). Now guess how much difference to self-esteem scores the intervention is likely to make over the duration of the study. Assume there is no change in the control arm. What is the difference between the scores of the intervention arm and the control arm? This is the difference you need to supply for calculation of sample size.
      
      I hope this helps.
      
      Regards,
      Dr. Roopesh
      
      1. Nneoma July 26, 2019 at 8:25 AM
        
        Dear Roopesh
        I’m very sorry for the late reply
        jhrba.com › articles
        The Effects of Physical Activity on Self-Esteem: A Comparative Study
        I hope this is worth it.
        Regards
        Nneoma
        
        
        Nneoma July 26, 2019 at 8:30 AM
        
        http://jhrba.com/en/articles/13221.html
        I think this link is better.
        
        
        drroopesh Post authorJuly 28, 2019 at 10:47 AM
        
        Dear Nneoma,
        
        Based on the details in the article, and assuming you intend to conduct a parallel trial with equal allocation, the estimated sample size (power 80%, alpha error 5%) would be 13 subjects in each arm.
        
        Regards,
        Dr. Roopesh
        
        
        Nneoma August 6, 2019 at 7:40 AM
        
        Dear Roopesh
        Thank you so much for your help, but I still need the formula or the textbook or even a link so that I will be able to reference it. Thank you once again.
        Regards,
        Nneoma.
        
        
        drroopesh Post authorAugust 8, 2019 at 3:02 PM
        
        Dear Nneoma,
        
        Please mail me at communitymedicine4asses@yahoo.com for the above.
        
        Regards,
        Dr. Roopesh
        
monaelesely June 24, 2019 at 7:53 PM

I’m starting a cross-sectional interventional study to detect the effect of kinesiotape on proprioception after ACL reconstruction surgery. I can’t find relevant literature. How can I calculate the sample size?

1. drroopesh Post authorJune 26, 2019 at 2:34 AM
  
  Dear Monaelesely,
  
  A cross-sectional study will be inappropriate if you wish to establish/ investigate causality. A longitudinal study (or at least a pre-post design) is desirable.
  
  Regards,
  Dr. Roopesh
  
monaelesely June 26, 2019 at 5:35 AM

Dear Dr. Roopesh,

I found this study. It is almost the same as the design that I want to use, but they used a convenience sample.

https://www.researchgate.net/publication/325275331_EFFECT_OF_KINESIOTAPING_ON_PROPRIOCEPTION_IN_PATIENTS_POST_ANTERIOR_CRUCIATE_LIGAMENT_RECONSTRUCTION_SURGERY

I would like to know if it is possible to calculate the sample size for my study, as it is almost similar.

appreciation
Monaelesely

1. drroopesh Post authorJune 28, 2019 at 12:05 PM
  
  Dear Monaelesely,
  
  The study has several flaws, the first being the study design. A cross-sectional study is one where each subject contributes a single observation only. In the study, subjects contributed more than one measurement. The study would be best described as quasi-experimental.
  
  There should have been controls in order to establish that the change observed was on account of the tape, and not the subjects practicing tasks before the second measurement. As such, there is a strong risk of bias.
  
  Convenience sampling further limits the generalizability of findings, since the sample wouldn’t be representative of the population from which it was drawn.
  
  The sample size calculation would have to be for a non-inferiority RCT, not a cross-sectional study.
  
  I recommend that you conduct a proper Randomized Controlled Trial, avoiding the errors committed by the authors of the article.
  
  Regards,
  Dr. Roopesh
  
Aremu Olalekan September 29, 2019 at 10:09 PM

hello Dr. Roopesh,
I am working on a dissertation titled “Assessment of quality of life and functional vision in children with visual impairment”. It is a cross-sectional descriptive study. I would like to know the appropriate sample size for this study and how to go about calculating it.
Thank you

1. drroopesh Post authorSeptember 30, 2019 at 1:24 AM
  
  Dear Aremu,
  
  Please state your objectives in PICO format. Then use the formula 4*pq/ l^2 mentioned in the article, using prevalence values from existing literature as ‘p’. ‘q’ is simply (1-p); and ‘l’ is 20% of ‘p’.
  
  Please go through the comment thread for details. If you still have doubts, feel free to let me know.
  
  Regards,
  Dr. Roopesh
  
iqra ishrat December 19, 2019 at 1:04 AM

hi dear dr. Roopesh
I am working on the title “Correlation between serum albumin levels and grades of esophageal varices in patients with chronic liver disease”. It’s a cross-sectional study, but I don’t know how to calculate the sample size. The prevalence of chronic liver disease differs between countries: in the US, 2 million deaths occur annually, and in China about 400,000 patients die annually.
could you please guide me
thankyou.

1. drroopesh Post authorDecember 19, 2019 at 1:00 PM
  
  Dear Iqra,
  
  The calculation of sample size involves the estimation of a range of sample sizes, then choosing the most appropriate value based on feasibility, etc.
  
  What you mention are the absolute number of deaths due to chronic liver disease.
  
  What you need to calculate the sample size is the proportion of population with esophageal varices in chronic liver disease.
  
  I will be better able to guide you if you provide your study objectives, study population and research question (in PICO format).
  
  Regards,
  Dr. Roopesh
  
Kaustav jain November 9, 2020 at 6:13 PM

hello dr roopesh,

I am doing a study on TO DETERMINE THE RELATIONSHIP BETWEEN FRONTAL SINUS PNEUMATIZATION AND DIFFERENT ANATOMIC VARIANTS OF PARANASAL SINUSES ON MAXILLOFACIAL CT.

I would be taking maxillofacial CT scans of random patients and classifying them on the basis of frontal sinus morphology (on CT scan) into 3 groups (aplasia/hypoplasia, medium and hyperplasia).
Then, in each of the 3 groups, I would look for variations (on CT scan) such as upper and middle concha pneumatisation, internal carotid artery dehiscence, nasal septal deviation, etc.

Since I am doing this on normal individuals, just checking whether normal structures coexist or not, there is no prevalence. (Prevalence can be found in the literature, such as the prevalence of full pneumatization of the frontal sinus with a deviated nasal septum, but I would be dividing patients into 3 groups and looking for multiple things in 1 group.)

Please help me choose an appropriate sample size. Or should I just take p as 0.5 and calculate the sample size?
Thank you.

1. drroopesh Post authorNovember 11, 2020 at 11:07 AM
  
  Dear Kaustav,
  
  What is your research question, and what are your objectives? Sample size must be calculated for each objective separately.
  
  Regards,
  Dr. Roopesh
  
Abhijit Das December 10, 2020 at 10:09 AM

Dear Dr. Roopesh Sir,
One article on “Sample size calculation for agreement study, particularly cohen’s kappa estimation” will be beneficial.
Thank you, sir.

1. drroopesh Post authorDecember 10, 2020 at 5:58 PM
  
  Dear Abhijit Das,
  
  Thank you for the suggestion. I will write an article on that in the near future.
  
  Regards,
  Dr. Roopesh
  
Davidson June 24, 2021 at 5:01 AM

I am a bit confused about which sample size estimation formula to use for a study to determine the knowledge and practice of first aid by lay people. Do I use the quantitative or the qualitative formula for cross-sectional studies? Also, I have not come across a ‘prevalence’ of first aid practice. How does that fit into my calculation?

1. drroopesh Post authorJune 24, 2021 at 9:27 PM
  
  Dear Davidson,
  
  The sample size calculation for cross-sectional studies uses the same formula.
  
  Perhaps the following articles may be of help to you:
  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5322636/
  https://pubmed.ncbi.nlm.nih.gov/27727043/
  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7446818/
  
  Regards,
  Dr. Roopesh
  
Talha November 26, 2021 at 8:08 PM

Dear Dr. Roopesh,
I’m writing a synopsis titled “Efficacy Of Selective Laser Trabeculoplasty (SLT) in Primary Open Angle Glaucoma in Patients on a Single or No Topical Drug Regimen” , in which my objective is to measure the fall in intra ocular pressure of the patients following SLT laser at 1,3 and 6 months interval. My understanding is that it’s a single armed Quasi- experimental study design for the synopsis. My 2 questions are
1. what should be an appropriate sampling technique for this study?
2. what formula can be used for calculating sample size?

Thankyou.

1. drroopesh Post authorNovember 26, 2021 at 8:44 PM
  
  Dear Talha,
  
  From your description I understand that there are two arms- single topical drug and no topical drug regimen. Please provide your research question in PICO format so that I may determine the appropriate study design. It is not evident that your study is a quasi-experimental design from your description.
  
  Regards,
  Roopesh
  
  1. Dr. Talha Nafees November 28, 2021 at 11:41 PM
    
    Dear Dr. Roopesh,
    Following is my understanding of PICO for synopsis.
    
    P(patient/population) = patients of primary open angle glaucoma visiting outpatient department of hospital.
    I(intervention) = SLT laser will be applied to the patient eyes after recording their intra-ocular pressure(IOP)
    C(comparison)= compared to the (IOP) of the patients before SLT laser application.
    O(outcome) = decrease in IOP of the patients following SLT laser.
    
    Thankyou
    
    1. drroopesh Post authorNovember 29, 2021 at 5:13 AM
      
      Dear Talha,
      
      Please frame the elements of PICO into a question- the way you frame the question will determine the study design.
      
      Regards,
      Dr. Roopesh
      
      1. Dr.Talha Nafees November 29, 2021 at 11:35 PM
        
        Dear Dr. Roopesh,
        Thank you so much for your kind help. I’ve determined it is a quasi-experimental study.
        One last thing about sample size: is this formula okay for calculating the sample size in this study?
        Sample size n = [DEFF*N*p*(1-p)] / [(d^2 / Z(1-α/2)^2)*(N-1) + p*(1-p)]
        Thank you, sir.
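
For readers who want to see how the formula quoted in this comment evaluates, here is a minimal Python sketch. It reads the formula as n = [DEFF*N*p*(1-p)] / [(d^2 / z^2)*(N-1) + p*(1-p)], which is an interpretation of the flattened notation. The function name and all input values are hypothetical, and whether the formula suits a given study design is a separate question.

```python
import math
from statistics import NormalDist


def proportion_sample_size(population, p, d, deff=1.0, confidence=0.95):
    """n = [DEFF*N*p*(1-p)] / [(d^2 / z^2)*(N-1) + p*(1-p)]."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 at 95%
    numerator = deff * population * p * (1 - p)
    denominator = (d ** 2 / z ** 2) * (population - 1) + p * (1 - p)
    return math.ceil(numerator / denominator)


# Hypothetical inputs: N = 1000, anticipated p = 50%, absolute precision d = 5%.
print(proportion_sample_size(population=1000, p=0.5, d=0.05))
```

Here p and d are entered as decimals, and DEFF = 1 corresponds to simple random sampling.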
        
        
        drroopesh Post authorNovember 30, 2021 at 6:25 PM
        
        Dear Dr. Talha,
        
        There are different types of quasi-experimental study designs and analytic approaches differ for each. Please note that a single group pretest posttest design without control is a poor design with many challenges to internal and external validity.
        
        Regarding the sample size formula you have mentioned, it is appropriate for cluster studies with two groups. Typically, such studies are population based, not hospital based. Since you have only one group, the formula is inappropriate.
        
        For a single group pretest-posttest design without control the sample size formula is
        
        n = 2 + (Z(1-α/2) + Z(1-β))^2 * S^2 / d^2
        
        where S = Standard deviation,
        d= Relative precision
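
A hedged sketch of how the formula above might be evaluated in Python is given here. The function name is illustrative, and the values of S and d are hypothetical placeholders (not estimates for any real study) that must be expressed in the same units.

```python
import math
from statistics import NormalDist


def pretest_posttest_sample_size(sd, d, alpha=0.05, power=0.80):
    """n = 2 + (Z(1-alpha/2) + Z(1-beta))^2 * S^2 / d^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(2 + (z_alpha + z_beta) ** 2 * sd ** 2 / d ** 2)


# Hypothetical values: S = 4 mmHg and d = 2 mmHg.
print(pretest_posttest_sample_size(sd=4, d=2))
```

Halving d roughly quadruples the required sample size, since d enters the formula squared.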
        
        Thank you for your patience.
        
        Regards,
        Roopesh
        
        
        Anis January 21, 2023 at 11:40 AM
        
        Dear Dr. Talha Nafees,
        If you don’t mind, I want to know about the name of the formula for sample size that you mentioned before. What is the name of that formula?
        
        Thank you
        
Seraj November 28, 2022 at 9:59 PM

Hello Dr Roopesh
I’m doing a cross-sectional study, and there haven’t been any studies done on the topic I’m working on,
so I can’t determine the prevalence.
Is there any way I can work out the sample size needed?

1. drroopesh Post authorDecember 1, 2022 at 11:50 AM
  
  Dear Seraj,
  
  If there are no published studies, you can conduct a pilot survey to obtain an estimate of the prevalence, then use that for calculation.
  
  Regards,
  Dr. Roopesh
  