*********************************************************************************
*************** ZA 2010 conversion to MTUS X ************************************
*********************************************************************************

use "ZA2010_activities.dta" , clear

*makes data view easier to look at
format Activity_code %8.0g
format Sametime %8.0g
format Location1 %8.0g
format Location2 %8.0g
format Q22HighestSchool %8.0g
format Q23MaritalStatus %8.0g

order hldid persid Timeslot slot
sort hldid persid Timeslot slot

*flagging slots when a new activity starts
bysort hldid persid : gen epact = Activity_code if (Activity_code[_n] != Activity_code[_n-1])

*this code creates a sequence variable for episode number
*--------------
preserve
drop if epact == .
bysort hldid persid: gen ep_n = _n /*create episode sequence number*/
sort hldid persid Timeslot Act
keep hldid persid Timeslot Act ep_n
save "ep_n.dta"
restore
*--------------

*merge new episode sequence number into working file
merge 1:1 hldid persid Timeslot Act using "ep_n.dta"
drop _merge

*delete file just merged
rm "ep_n.dta"

*fill in blanks for new episode sequence number
replace ep_n = ep_n[_n-1] if ep_n[_n] == .

*fill in blanks for new activity variable (you can actually delete this and use the old one)
replace epact = epact[_n-1] if epact[_n] == .
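
* Optional sanity check (a sketch, not part of the original conversion).
* After the fill-down above, every slot should carry an episode number, and ep_n
* should not decrease within a diary. The second check assumes the data are still
* in the post-merge order (hldid persid Timeslot Act) and that this order matches
* the Timeslot/slot order used when flagging episodes; if it does not, a reported
* failure may be a false alarm. -capture noisily- prints a failure without
* stopping the do-file.
capture noisily assert !missing(ep_n)
capture noisily assert ep_n >= ep_n[_n-1] if _n > 1 & hldid == hldid[_n-1] & persid == persid[_n-1]
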
drop slot dur total_time
replace time = 7.5 if Timeper == 7.5 & time == 7 //somehow the time variable was 7 and should be 7.5

*this creates a new episode file with one case for each new episode
*note rule (first) is applied to all variables except 'time', which is summed (this is the episode duration variable)
collapse (first) Timeslot Act Q116Gender Q117Age Q118Population Q22HighestSchool ///
    Q23MaritalStatus Activity_code Sametime Location1 Location2 Fulltime ///
    Activ Timeper Weight sec unid epact (sum) time , by(hldid persid ep_n)

*check all is well with the world by computing total time variable
bysort hldid persid: egen tottime = sum(time)
ta tottime

* epnum *
gen epnum = ep_n

* clockst *
gen clockst = .
replace clockst = 4.0 if Timeslot==1
replace clockst = 4.3 if Timeslot==2
replace clockst = 5.0 if Timeslot==3
replace clockst = 5.3 if Timeslot==4
replace clockst = 6.0 if Timeslot==5
replace clockst = 6.3 if Timeslot==6
replace clockst = 7.0 if Timeslot==7
replace clockst = 7.3 if Timeslot==8
replace clockst = 8.0 if Timeslot==9
replace clockst = 8.3 if Timeslot==10
replace clockst = 9.0 if Timeslot==11
replace clockst = 9.3 if Timeslot==12
replace clockst = 10.0 if Timeslot==13
replace clockst = 10.3 if Timeslot==14
replace clockst = 11.0 if Timeslot==15
replace clockst = 11.3 if Timeslot==16
replace clockst = 12.0 if Timeslot==17
replace clockst = 12.3 if Timeslot==18
replace clockst = 13.0 if Timeslot==19
replace clockst = 13.3 if Timeslot==20
replace clockst = 14.0 if Timeslot==21
replace clockst = 14.3 if Timeslot==22
replace clockst = 15.0 if Timeslot==23
replace clockst = 15.3 if Timeslot==24
replace clockst = 16.0 if Timeslot==25
replace clockst = 16.3 if Timeslot==26
replace clockst = 17.0 if Timeslot==27
replace clockst = 17.3 if Timeslot==28
replace clockst = 18.0 if Timeslot==29
replace clockst = 18.3 if Timeslot==30
replace clockst = 19.0 if Timeslot==31
replace clockst = 19.3 if Timeslot==32
replace clockst = 20.0 if Timeslot==33
replace clockst = 20.3 if Timeslot==34
replace clockst = 21.0 if Timeslot==35
replace clockst = 21.3 if Timeslot==36
replace clockst = 22.0 if Timeslot==37
replace clockst = 22.3 if Timeslot==38
replace clockst = 23.0 if Timeslot==39
replace clockst = 23.3 if Timeslot==40
replace clockst = 0.0 if Timeslot==41
replace clockst = 0.3 if Timeslot==42
replace clockst = 1.0 if Timeslot==43
replace clockst = 1.3 if Timeslot==44
replace clockst = 2.0 if Timeslot==45
replace clockst = 2.3 if Timeslot==46
replace clockst = 3.0 if Timeslot==47
replace clockst = 3.3 if Timeslot==48

* start & end *
*each episode starts where the previous episode of the same diary ended;
*start is carried forward row by row, so end can be computed in one step afterwards
sort hldid persid epnum
gen start=.
replace start = 0 if Timeslot==1
replace start = start[_n-1] + time[_n-1] if hldid== hldid[_n-1] & persid ==persid[_n-1]
gen end = start + time
bys hldid persid: egen max_end = max(end)
fre max_end

* activity codes *

* main/ sec*
gen main=Activity_code
recode main (10=2) (30 90=4)
replace main = 5 if acta ==20 & (loca ==3 | loca ==4)
recode main (41 42 43 48 = 25) (50=55) (60 820=34) (80=67)
recode main (111 = 7)
recode main (115 130 190 210/260 290 310 320 330 340 350 360 370=777)
recode main (112=8)
replace main = 8 if main == 777 & loca==1
replace main = 7 if main == 777 & (loca ==3 | loca==4)
recode main (113 = 23)
recode main (114= 13)
recode main (140=12)
recode main (150=14)
recode main (180=63)
recode main (280=11)
recode main (370 390=9)
recode main (410=18)
recode main (420=20)
recode main (430 = 21)
recode main (440=24)
recode main (441 448=26)
recode main (450 460 490=22)
recode main (470=27)
recode main (480=67)
recode main (511 512 590= 28)
recode main (491=22)
recode main (521 522 =29)
recode main (531 532 561 562=31)
recode main (540 550 673=32) (580 =66) (610 615 620 630 650 660 671 672 674 690=33)
recode main (680=65)
recode main (710=15) (720=16) (730 740 790 =17) (780=64)
recode main (810 =40) (832=48) (831 833=49) (840=52) (870=35) (850=42) (860 890=50)
recode main (880=68) (910 = 56) (920 990=59) (930=58) (940=61) (950=38) (980=68)
replace main =7 if main ==777 & loca !=1
recode main (380=11)
recode main (88 488=67) (188=63) (288 388=68) (588=66) (788=64) (888 = 68)

* sec*
replace sec = 69 if acta ==69
recode sec (10=2) (30 90=4)
replace sec = 5 if acta ==20 & (loca ==3 | loca ==4)
recode sec (41 42 43 48 = 25) (50=55) (60 820=34) (80=67)
recode sec (111 = 7)
recode sec (115 130 190 210/260 290 310 320 330 340 350 360 370=777)
recode sec (112=8)
*the two lines below tested main == 777, which is never true at this point; they now test sec
replace sec = 8 if sec == 777 & loca==1
replace sec = 7 if sec == 777 & (loca ==3 | loca==4)
recode sec (113 = 23)
recode sec (114= 13)
recode sec (140=12)
recode sec (150=14)
recode sec (180=63)
recode sec (280=11)
recode sec (370 390=9)
recode sec (410=18)
recode sec (420=20)
recode sec (430 = 21)
recode sec (440=24)
recode sec (441 448=26)
recode sec (450 460 490=22)
recode sec (470=27)
recode sec (480=67)
recode sec (511 512 590= 28)
recode sec (491=22)
recode sec (521 522 =29)
recode sec (531 532 561 562=31)
recode sec (540 550 673=32) (580 =66) (610 615 620 630 650 660 671 672 674 690=33)
recode sec (680=65)
recode sec (710=15) (720=16) (730 740 790 =17) (780=64)
recode sec (810 =40) (832=48) (831 833=49) (840=52) (870=35) (850=42) (860 890=50)
recode sec (880=68) (910 = 56) (920 990=59) (930=58) (940=61) (950=38) (980=68)
replace sec = 8 if sec ==777 & loca ==1
recode sec (380=11)
recode sec (88 488=67) (188=63) (288 388=68) (588=66) (788=64) (888 = 68)
recode sec (777=7)
recode sec (.=-8)

* av *
gen av=acta
recode av (111 113 115 130 140 190 310 320 330 340 350 360 370 390=1)
recode av (112 150=2) //av3=-8//
recode av (710 730 740=4)
recode av (180 380 780=5)
recode av (410=6)
recode av (420 430=7)
recode av (236 240 250 450 460 470 490 491 540 673=8)
recode av (210 220 230 290=9)
*original code 10 is recoded here, before the next line creates the new value 10;
*run later, (10=16) would also move the freshly created 10s to 16
recode av (10=16)
recode av (260 440 441 448=10)
recode av (511 512 521 522 561 562 590 671 672=11)
recode av (80 280 480 531 532 550 580=12)
recode av (30=13)
recode av (41 42 43 48 90=14)
recode av (20=15)
recode av (680 880 980=17)
recode av (870 950=18)
recode av (850=19) //av 320 21 24 26 27 28 32 37 38 39= -8//
recode av (60 820=22)
recode av (114 610 615 620 630 650 660 674 690=23)
recode av (810 890=25)
recode av (831 832 833=29)
recode av (930=30)
recode av (920 990=31)
recode av (720 790 940=33)
recode av (910=34)
recode av (50=36)
recode av (840 860=40)
recode av (-7 -6 -8=41)
recode av (88 488=17) (188=5) (288 388 588 788 888 = 17)

*core25*
gen core25 = main
replace core25 = 1 if main ==2 | main==3
replace core25 = 2 if main ==5 | main==6
replace core25 = 3 if main ==1 | main==4
recode core25 (7 8 9 10 11 12 13 14 =4) (15 16 17 =5) ///
    (18 19=6) (20 21 23 = 7) (22=8) (24 25 26=9) (46=10) (27 47=11) (32=12) (28 31=13) (29 30=14) ///
    (34=15) (33=16) (63 64=17) (62 65 66 67 68=18) (42 43 44=19) (57 58 59=20) (56=21) ///
    (60 61=22) (35 36 37 38 39 40 41 45=23) (48 49 50 51 52 53 54 55=24) (69=25)

************
* location *
************

* eloc *
gen eloc = loca
recode eloc (3 4 =3) (5=4) (7=8) (6 8=9)

* mtrav *
gen mtrav=Location2
recode mtrav (4 5 =1) (6 7 =2) (8=4) (9=5)

* ict *
gen ict = 1 if acta ==940 | sec==940
recode ict (.=0)

* co-presence *
gen alone=-8
gen child=-8
gen sppart = -8
gen oad = -8

* id *
gen id=1

* diary *
gen diary=1

* inout*
gen inout= Location2
recode inout (1=1) (2 3 8=2) (4/7 9=3)

* animal *
gen animal = 1 if main==27 | sec==27
replace animal = 0 if animal !=1

***********************************
* BACKGROUND VARIABLES *
***********************************

* country *
gen country = "ZA"

* survey *
gen survey = 2010

*swave*
gen swave=0

* msamp*
gen msamp=0

* sex *
gen sex = Q116Gender

* age *
gen age =Q117Age

* badcase *
bys hldid persid: egen max_epi = max(epnum)
*minutes with a missing main activity; computed first because badcase uses sum_mis
gen mis=time if main==.
bys hldid persid: egen sum_mis = sum(mis)
gen badcase =.
replace badcase = 4 if max_epi <=7
replace badcase = 1 if sum_mis >90
replace badcase = 0 if badcase==.

* nowght *
gen nowght = 1 if badcase !=0

*****************************************************************
* generating & merging in background variables from other files *
*****************************************************************

* from person.dta *
destring persid, replace
merge m:1 hldid persid using tus-2010-person-v1-20140409.dta, ///
    keepusing(Q22HighestSchool Q23MaritalStatus ///
    Q118Population Q24Spouse Q26Child18Alive ///
    Q31PdWrk Q41Occupation Q46TotIncome Occup Province ///
    Education_Status Weight civstat cohab nchild partid agekidx ///
    agekid2 day cday month year diary ocombwt workhrs empstat emp ///
    unemp edcat sector region empinclm occupo educa ethnic ///
    migrantd migrantm migrantf isco1 retired student)
drop _merge
merge m:1 hldid persid using tus-2010-person-v1-20140409.dta, ///
    keepusing(civsum)
drop _merge

* day *
gen day =Q52DayDiary
recode day (7=1) (1=2) (2=3) (3=4) (4=5) (5=6) (6=7)

* cday *
tostring Q51DateDiary, gen(date_str)
gen str cday = substr(date_str, 1, strlen(date_str) - 6)
destring cday, replace

* month *
gen str month = substr(date_str, -6,2)
destring month, replace

* year *
gen year=2010

* diary, nowght: generated above *

* parntid1 parntid2 *
gen parntid1=-8
gen parntid2=-8

*partid *
gen partid_n = Q25Spouse
gen partid = -8

*hhtype *
bys hldid: egen civsum= total(civstat)
gen hhtype=.
replace hhtype = 1 if hhldsize==1
replace hhtype=2 if hhldsize==2 & civsum>=1
replace hhtype=3 if hhldsize>=3 & civsum>=1
replace hhtype=4 if hhldsize>=3 & civsum==0

*famstat *
gen famstat = 0 if (age >=18 & age <=39) & nchild ==0
replace famstat =1 if age >=18 & agekidx==2 // under 7
replace famstat =2 if age >=18 & agekidx==3 // under 18, over 7
replace famstat =3 if age >=40 & nchild==0
replace famstat =5 if age <18 & famstat ==.
* replace famstat =     // unfinished statement in the source, kept as a comment so the file parses

*hhldsize *
gen hhldsize =Q122NumberEligible

* nchild *
gen nchild =Q27Child18HH

* agekidx *
gen agekidx = .
replace agekidx = 2 if Q29Child06HH>=1 & Q29Child06HH!=88
replace agekidx = 3 if agekidx!=2 & Q27Child18HH<=9 & Q29Child06HH!=0

* agekid2 *
gen agekid2=-8

*income * // total hh income grouped //
gen income =.
replace income=-7 if incorig==0 | incorig==1
replace income = 1 if incorig==2 | incorig==3 | incorig==4 | incorig==5
replace income = 2 if incorig==6 | incorig==7
replace income = 3 if incorig>7 & incorig <13
replace income = -8 if incorig>=13

*ownhome *
gen ownhome=-8

*urban *
gen urban=-8

*computer*
gen computer=1 if Q12Computer==1
replace computer=0 if Q12Computer==2

*vehicle *
gen vehicle=1 if Q12Car==1
replace vehicle = 0 if Q12Car==2

*singpar *
gen singpar=-8

*relrefp*
gen relrefp=-8

*civstat*
gen civstat=.
replace civstat =1 if Q23MaritalStatus==1 | Q23MaritalStatus==2
replace civstat = 2 if Q23MaritalStatus>=3

* cohab *
gen cohab =.
replace cohab= -7 if civstat==2
replace cohab=1 if Q23MaritalStatus==2
replace cohab=0 if Q23MaritalStatus==1

*citizen*
gen citizen =-8

* whereborn*
gen whereborn=-8

* empstat*
gen empstat = 3 if Status==1
replace empstat = 4 if Status==2 | Status==3
replace empstat=-8 if empstat==.

* emp*
gen emp=1 if Status==1
replace emp=0 if emp==.

* unemp*
gen unemp=1 if Status ==2
replace unemp=0 if unemp==.

*student*
gen student=-8

*retired*
gen retired= 1 if Q45SourceIncome==3 | Q45SourceIncome==6
replace retired=0 if retired !=1

*empsp*
gen empsp=-8
gen empsp_n=.
gen empsp_help = 1 if civstat ==1 //flagging partners//
replace empsp_help = empstat if empsp_help ==1
*on consecutive partners (one after another)*
gen civstat_help=1 if partid>=1
bys hldid: replace empsp = empsp_help[_n-1] if civstat_help !=. //note: also recodes in the second HH if 2 HHs are next to each other!//
bys hldid: replace empsp = empsp_help[_n+1] if civstat_help !=. & empsp_help[_n+1] !=. & hldid == hldid[_n+1] & empsp ==.
* if partners on different numbers within HH (split by one)*
bys hldid: replace empsp = empsp_help[_n-2] if empsp_help[_n-1]==. & civstat_help !=. & empsp_help[_n-2] !=. & empsp ==.
bys hldid: replace empsp = empsp_help[_n+2] if empsp_help[_n+1]==. & civstat_help !=. & empsp_help[_n+2] !=. & empsp ==.
* other cases that cannot be captured by macro *

*workhrs *
gen workhrs = Q43Hours

*isco1 *
gen isco1=Occup
recode isco1 (10=9)

*sector*
gen sector=-8

* edcat*
gen edcat = Education_Status
recode edcat (1 2 3 4=1) (5=2) (6 7=3)

*rushed *
gen rushed=-8

*health *
gen health=-8

*carer *
gen carer=-8

*disab*
gen disab=-8

* ocombwt *
gen ocombwt = Weight

*propwt*
egen count=count(badcase)
egen countp_bis=count(badcase) if badcase==0
egen countp=min(countp_bis)
sort sex age day
egen daywt = group(sex age day) if badcase==0
egen ngroupsd=max(daywt) if badcase==0
by sex age day: egen daycount2=count(daywt) if badcase==0
sort sex age
egen weekwt=group(sex age) if badcase==0
by sex age: egen wkcount2=count(weekwt) if badcase==0
gen propwt=((wkcount2/7)/daycount2) if badcase==0
replace propwt=0 if badcase!=0
replace propwt=propwt*(count/countp)
egen mean_weight=mean(propwt)
replace propwt=propwt/mean_weight

************************
* from household.dta *
************************
merge m:1 hldid using tus-2010-household-v1-20140409.dta, keepusing(computer ///
    vehicle incorig region hhldsize)

*************************
* preparing final files *
*************************

* drop total time under 1440 *
drop if max_end !=1440

********
* MAIN *
********
keep country survey swave msamp hldid persid id day cday month year diary nowght ///
    time clockst start end epnum main core25 sec av inout eloc ict mtrav alone child sppart oad ///
    animal parntid1 parntid2 partid hhtype hhldsize nchild agekidx agekid2 income ownhome urban computer ///
    vehicle sex age famstat singpar relrefp civstat cohab citizen whereborn empstat emp unemp student retired empsp ///
    workhrs isco1 sector edcat rushed health carer disab ocombwt propwt

order country survey swave msamp hldid persid id day cday month year diary nowght ///
    time clockst start end epnum main core25 sec av inout eloc ict mtrav alone child sppart oad ///
    animal parntid1 parntid2 partid hhtype hhldsize nchild agekidx agekid2 income ownhome urban computer ///
    vehicle sex age famstat singpar relrefp civstat cohab citizen whereborn empstat emp unemp student retired empsp ///
    workhrs isco1 sector edcat rushed health carer disab ocombwt propwt

**********
* SURVEY *
**********

* from person.dta *

* region *
gen region = Province
label val region Province

* incorig *
gen incorig = Q113Income
label val incorig Q113Income

* empinclm *
gen empinclm = Q46TotIncome
label val empinclm TotIncome

* occupo *
gen occupo = Occup
label val occupo Occup

* educa *
gen educa = Education_Status
label val educa Education_Status

* ethnic *
gen ethnic = Q118Population
label define ethnic ///
    1 "African/Black" ///
    2 "Coloured" ///
    3 "Indian/Asian" ///
    4 "White"
label val ethnic ethnic

* migrantd *
gen migrantd = -9

* migrantm *
gen migrantm=-9

* migrantf *
gen migrantf=-9

* data prep *
keep if epnum==1
keep country survey swave msamp hldid persid incorig region empinclm occupo ///
    educa ethnic migrantd migrantm migrantf
sort country survey swave msamp hldid persid incorig region empinclm occupo ///
    educa ethnic migrantd migrantm migrantf
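
* Optional check (a sketch, not part of the original conversion).
* After -keep if epnum==1- the survey-level file is meant to hold one row per
* diarist, so hldid and persid should jointly identify the observations. That
* one-row-per-person structure is an assumption read off the code above. -isid-
* errors if the key is not unique; -capture noisily- reports the error without
* halting the do-file.
capture noisily isid hldid persid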