Methodology - NK Leadership Tracker | NK PRO
October 05, 2024

Methodology – NK Leadership Tracker

The North Korean Leadership Tracker is a database that tracks every appearance, including on-the-spot guidance inspections and diplomatic meetings, made by Kim Jong Il (KJI) and Kim Jong Un since 1994. NK News intends the Leadership Tracker to provide researchers with data in an area where it is hard to come by. Due to the pre-eminence of the Supreme Leader in the North Korean system, tracking his appearances can provide useful clues about North Korean policy and the positioning of elites within the regime. Though this information does not provide hard and fast answers to pressing questions on North Korea’s internal workings, it should help in solving many of the political puzzles surrounding the country. The data is therefore a starting point, not an endpoint, and will be most useful when paired with other event data in order to pick out patterns of North Korean behavior.

The dataset revolves around the on-the-spot guidance inspections, art performances, diplomatic meetings, and other public appearances that were made by Kim Jong Il during his time in power (and since his death, those of Kim Jong Un). Research assistants from NK News coded:

  • Each event that the two leaders appeared at from 1994 to present day
  • The names of those elites who also appeared at the event (names were entered in the order that they appeared in the story)
  • Where the event was located (if identified)
  • A description of the event and type

The data show that Kim Jong Il appeared at over 2,000 different locations with over 200 different elites between July 1994 and December 2011. These events covered a wide range, including visits to military units, art performances, diplomatic excursions to Russia and China, and more idiosyncratic visits such as those to an ostrich farm, a funfair and the zoo.

Event types were coded as arts / cultural (events such as performances not of a military nature), diplomatic (any event featuring a diplomat or foreign leader or any event outside of North Korea), economic (any event related to economic activities or functions), military (any visit to a military base or performances by military units), political (events such as political conferences, visits to Kim Il Sung’s memorial, etc.) or other (events that do not fit the previous criteria, such as a visit to a funfair).

Inclusiveness and Accuracy

Data was collected from Korean Central News Agency (KCNA) reports published online through www.kcna.kp and www.kcna.jp. All reports published in English were included and took primacy in coding. Each story was recorded in an Excel database by one coder.

In line with the methodology outlined above, accuracy was the primary concern. Coders were given very specific instructions. Eight coders initially covered KCNA reports from 1997 to 2011. As an additional check, the data was cross-referenced with the Ministry of Unification almanac of Kim Jong Il visits (Kim Jong Il Hyunjijido Donghyang 1994-2011) to ensure no reports were missed. If an event or report listed in the almanac was missing from the dataset, it was incorporated as prescribed by the methodology.

The English versions of reports were given primacy over other languages. In the event a report was available but not in English (or if the Korean version contained additional data not found in the English version), it was included in the database with the aid of a translator.

The project leader manually reviewed the dataset from January to October 2012. During this time, data was constantly updated and various errors fixed. At first this consisted mostly of standardizing elite and provincial names and ensuring there were no typos. Afterwards, as location data was added to the database, the opportunity was taken to also double check much of the other information and correct any errors.

Prior to launch, the team decided to systematically verify the accuracy of the data. On 29 June 2012, the dataset contained 10,192 observations across 17 recorded variables. Treating the database as the population, the team decided that systematically checking 700 entries would be sufficient to test the overall accuracy of the data and provide an additional layer of confidence to users.

The team decided that 95% accuracy would be sufficient for launch of the North Korean Leadership Tracker. The sample of 700 observations (11,900 coded entries, at 17 variables per observation) was chosen because it was large enough to be statistically representative of the dataset and would provide a confidence interval of ± 3.57% at a confidence level of 95% for each observation as a whole and ±<1% for each variable entry.
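The stated intervals can be reproduced with the standard margin-of-error formula for a proportion. The sketch below is an illustration, not the team's own calculation: it assumes the most conservative proportion (p = 0.5), z = 1.96 for a 95% confidence level, and a finite population correction, and it treats the 11,900 entries as if they were a simple random sample of all 10,192 × 17 entries. The function and variable names are hypothetical.

```python
import math


def margin_of_error(sample_size, population_size, z=1.96, p=0.5):
    """Margin of error for a proportion, with finite population correction.

    Assumes simple random sampling without replacement (an assumption of
    this sketch, not something stated in the methodology).
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    correction = math.sqrt(
        (population_size - sample_size) / (population_size - 1)
    )
    return z * standard_error * correction


OBSERVATIONS = 10_192   # observations in the dataset on 29 June 2012
VARIABLES = 17          # recorded variables per observation
SAMPLED = 700           # observations checked

moe_observations = margin_of_error(SAMPLED, OBSERVATIONS)
moe_entries = margin_of_error(SAMPLED * VARIABLES, OBSERVATIONS * VARIABLES)

print(f"per observation: ±{moe_observations:.2%}")  # ±3.57%
print(f"per entry:       ±{moe_entries:.2%}")       # below ±1%
```

Under these assumptions the per-observation figure matches the ± 3.57% quoted above, and the per-entry figure falls below 1%.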

The 700 cases were selected at random using the online tool http://www.random.org/. A log sheet was also created to ensure each case was checked systematically and coding errors could be analyzed. The project leader checked the random sample of 700 cases against the original reports from KCNA and kept a log of accuracy for each variable.
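For readers who want to repeat this kind of check locally, the selection step can be sketched as follows. This is only an illustration: the team used random.org, not this code, and the seed, the case numbering from 1 to 10,192 and the layout of the log sheet are all assumptions.

```python
import random

POPULATION_SIZE = 10_192  # observations in the dataset
SAMPLE_SIZE = 700         # cases to verify

# A fixed seed (hypothetical) makes the draw reproducible for auditing.
rng = random.Random(2012)

# Draw 700 distinct case numbers without replacement.
sample = sorted(rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE))

# One log row per sampled case; "errors" lists the variables found miscoded.
log_sheet = [{"case": case, "checked": False, "errors": []} for case in sample]

print(len(log_sheet), "cases queued for checking")
```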

Results

Across the 11,900 individual entries, 68 were miscoded, giving 99.43% (±<1%) accuracy, sig. 0.05.

In total, 61 of the 700 observations were found to contain inaccuracies, giving 91.4% (± 3.57%) accuracy, sig. 0.05. However, a significant number of these errors, 22, were related to date format, with coders entering the date of the report rather than the date of the event. This suggests that elimination of this error alone would increase accuracy to 94.42%.
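The accuracy figures follow directly from the error counts. The short sketch below recomputes them from the counts stated above; the function name is hypothetical. The recomputed observation-level values differ slightly in the last digits from the reported percentages, which may reflect rounding or a small difference in the underlying counts.

```python
def accuracy(total, errors):
    """Share of items that were coded correctly."""
    return (total - errors) / total


entry_accuracy = accuracy(11_900, 68)        # individual coded entries
observation_accuracy = accuracy(700, 61)     # whole observations
without_date_errors = accuracy(700, 61 - 22) # if the 22 date errors are removed

print(f"{entry_accuracy:.2%}")        # 99.43%
print(f"{observation_accuracy:.2%}")  # 91.29%
print(f"{without_date_errors:.2%}")   # 94.43%
```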

Given the high accuracy of individual entries, and given that the most frequent miscoding was an incorrect date, the decision was made to proceed with the launch of the project, as the errors caused limited distortion to data accuracy and analysis.

However, two conclusions were drawn from the exercise.

  1. A systematic review of each observation should continue in the near future, running in parallel with data collection.
  2. The methodology was lacking inter-coder reliability checks and error logs. These would be added to the methodology process as of August 2012.

Reliability and Consistency

No inter-coder reliability tests were carried out during the first phases of the data collection process. Researchers were reasonably confident that the coding guidelines were applied consistently by coders, and the project leader worked extensively with the coders and on data verification. There was limited room for idiosyncratic categorization by coders; the risk lay mostly in sector type and location, but also in unusual place and person names. However, after sampling it was decided that greater efforts would be made to systematize the coding checks and provide a check against coder fatigue, poor training or ‘rogue’ coders.

Intercoder reliability tests were chosen as a method to increase objectivity and reliability. Given the limited subjectivity in the coding, designing the test was relatively simple, and for the same reason only relatively high scores on the intercoder agreement coefficient, over 80%, were considered acceptable.

The following equation was used:

$$PA_o = \frac{2A}{n_1 + n_2}$$

Where PAo is the proportion agreement observed, A is the number of agreements between two coders, and n1, n2 are the respective number of items coded by each of two coders.[1]

The intercoder reliability test was carried out as follows.

  1. Training of coders on methodology as a group
  2. Coders independently enter data for real-life examples
  3. Intercoder reliability is calculated using the above equation
  4. If PAo is equal to or greater than 90%, coders begin bulk data entry
  5. If PAo is less than 90%, coders are brought together to discuss discrepancies, Steps 1 through 3 are repeated and intercoder reliability tests continue
  6. If problems persist, methodology and codebook will be revisited.
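The calculation and the 90% decision rule in the steps above can be sketched in a few lines. This is a minimal illustration that assumes the equation is the proportion-agreement formula PAo = 2A / (n1 + n2) described by the variable definitions; the function names, the dictionary layout and the example codings are hypothetical and not taken from the project's data.

```python
THRESHOLD = 0.90  # agreement required before bulk data entry (step 4)


def proportion_agreement(coder_a, coder_b):
    """PAo = 2A / (n1 + n2) for two coders' entries keyed by item."""
    if not coder_a and not coder_b:
        raise ValueError("both coders have no entries")
    # A: items both coders coded and coded identically.
    agreements = sum(
        1 for item, value in coder_a.items()
        if item in coder_b and coder_b[item] == value
    )
    return 2 * agreements / (len(coder_a) + len(coder_b))


def next_step(pa_o, threshold=THRESHOLD):
    """Apply steps 4 and 5 of the procedure."""
    if pa_o >= threshold:
        return "begin bulk data entry"
    return "discuss discrepancies and repeat steps 1 through 3"


# Hypothetical sector codes entered by two coders for five reports.
coder_a = {"r1": "M", "r2": "E", "r3": "A", "r4": "P", "r5": "D"}
coder_b = {"r1": "M", "r2": "E", "r3": "M", "r4": "P", "r5": "D"}

pa_o = proportion_agreement(coder_a, coder_b)  # 2 * 4 / (5 + 5) = 0.8
print(f"PAo = {pa_o:.0%}: {next_step(pa_o)}")
```

In this made-up example the coders disagree on one of five items, so agreement is 80% and the procedure would send them back to step 1.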

Results from the first batch of reliability tests, carried out in early September, found the proportion of agreement to be 69%. Subsequent redrafting of the coding guide and repetition of steps 1 through 3 raised the proportion of agreement to 80%.

A score of 70-80% would be considered fair in most textbooks on content analysis. When there is room for subjectivity, we would expect numbers in this range. For example, if a coder had to log the amount of violence in a TV show on a scale of 1 through 5, from none to extreme, then one would expect some variation between coders on what they thought was mild or extreme, even if given a guide and training.

[1] Neuendorf, K. A. (2002). The content analysis guidebook. Thousand Oaks, CA: Sage Publications, Inc.

Database Variables

I.         Name – Name of the elite – Text Variable

Names are entered in the common format of family name followed by given names, e.g. Kim Jong Il.
Names are entered in the order in which the elites are recorded in the news story.

II.         Position – If a position (for example, General) is given for the elite – Text Variable

III.         Date – Date of the event as given in the report by Korean Central News Agency (KCNA) – Text Variable

The date is recorded using the MM/DD/YYYY (Gregorian, month-first or middle-endian) format. For example, 15 April 1912 would be recorded as 04/15/1912.
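Because the date is stored as text, users loading the data will usually want to convert it to a proper date type. A minimal sketch using Python's standard library follows; the helper names are hypothetical and not part of the Leadership Tracker itself.

```python
from datetime import date, datetime

DATE_FORMAT = "%m/%d/%Y"  # MM/DD/YYYY, as described above


def format_event_date(event_date):
    """Render a date in the dataset's MM/DD/YYYY text format."""
    return event_date.strftime(DATE_FORMAT)


def parse_event_date(text):
    """Parse an MM/DD/YYYY string; raises ValueError if it is malformed."""
    return datetime.strptime(text, DATE_FORMAT).date()


print(format_event_date(date(1912, 4, 15)))  # 04/15/1912
print(parse_event_date("04/15/1912"))        # 1912-04-15
```

Parsing strictly in this way also catches entries typed day-first where the day exceeds 12, since a string such as 15/04/1912 fails to parse.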

IV.         Country – Country reported in source – Categorical Variable

Will be North Korea, China or Russia.

V.         Province – If province is given or can be determined. – Categorical Variable

North Korean Provinces / Special Cities: Pyongyang, Kangwon, North Hamgyong, South Hamgyong, North Hwanghae, South Hwanghae, Ryanggang, Rason, Jagang, North Phyongan, South Phyongan

Chinese Provinces: Beijing, Tianjin, Heilongjiang, Inner Mongolia, Jiangsu and Liaoning

Russian Provinces: Amur Oblast, Buryatia, Khabarovsk Krai, Primorsky Krai

If it cannot be identified it is listed as “Not Specified”

VI.         County / City – Additional local information is added here if it can be determined from either the actual report or reliable secondary sources such as North Korea Economy Watch. – Text Variable

VII.         Place – Refers to either the location of the visit or the KPA Unit that was visited. KPA Units are listed as “Unit XX” – Text Variable

VIII.         Description – Brief description of the event, i.e. “Inspected KPA Unit XX” or “Viewed performance of art show” – Text Variable

IX.         Long City – Longitude for the city where the event takes place or nearest city if it takes place in a rural area.  – Numeric Variable

X.         Sector – Sector type – Categorical Variable

What type of sector this event should fall under was based on the following criteria:

A: Arts / Cultural – Involves cultural relics, performances, arts, or anything of that sort with exceptions (see military)
D: Diplomatic – Any meetings or events with foreigners or trips outside of North Korea.
E: Economic – Visits to economic facilities; factories, farms, or anything of that sort. Includes visits to military-run production facilities.
M: Military – Anything with an overt military theme or events that would normally be arts / cultural or economic where there is significant military presence.
P: Political – Activities that do not fall into the other categories, including certain memorials or events related to Kim Il Sung, as well as events related to the party.
O: Other – Anything that does not fall within the other five categories

XI.         Report – Report number from KCNA – Numeric Variable

The sequential number of the report on the KCNA website for a given day. The first report of a day would be entered as “1”, the second as “2” and so on.

XII.         Event – Event number – Numeric Variable
As a report may cover multiple events, the sequence is recorded. For example, an elite may look at a factory, a mine and a stadium in a single report, but these would be three separate events.

XIII.         Rank – Order of appearance for elites in the news story (not including Kim Jong Il). – Numeric Variable

For example, if the order of appearance in an article is Jang Song Thaek, Kim Ki Nam, Kim Kyong Hui, etc then the rank would be Jang Song Thaek – 1, Kim Ki Nam – 2, Kim Kyong Hui – 3, etc.
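The rank assignment in this example can be expressed as a short function. This is only a sketch of the rule as described here: the function name is hypothetical, and excluding the leader by exact name match is an assumption about how the rule would be applied.

```python
def rank_elites(names_in_order, leader="Kim Jong Il"):
    """Number elites from 1 in order of appearance, skipping the leader."""
    elites = [name for name in names_in_order if name != leader]
    return {name: rank for rank, name in enumerate(elites, start=1)}


article_order = ["Kim Jong Il", "Jang Song Thaek", "Kim Ki Nam", "Kim Kyong Hui"]
print(rank_elites(article_order))
# {'Jang Song Thaek': 1, 'Kim Ki Nam': 2, 'Kim Kyong Hui': 3}
```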

XIV.         Link – Link to source for each report – Text (HTML) Variable

Location variables are also entered into the North Korean Leadership Tracker, but this is done when the data is uploaded to the online tool. These variables include:

Latitude – Latitude of event location – Numeric Variable

Longitude – Longitude of event location – Numeric Variable

Lat City  –  Latitude for the city where the event takes place or nearest city if it takes place in a rural area.  – Numeric Variable
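For users who want to work with the data programmatically, the variables above map naturally onto a single record type. The sketch below is one possible representation and not the project's actual schema: the class and field names, the choice of which fields are optional, and the example values (including the placeholder link) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Sector(Enum):
    """Sector codes as listed under variable X."""
    ARTS_CULTURAL = "A"
    DIPLOMATIC = "D"
    ECONOMIC = "E"
    MILITARY = "M"
    POLITICAL = "P"
    OTHER = "O"


COUNTRIES = {"North Korea", "China", "Russia"}  # variable IV


@dataclass
class AppearanceRecord:
    name: str                    # I.    family name first, e.g. "Kim Ki Nam"
    position: Optional[str]      # II.   e.g. "General", if given
    date: str                    # III.  MM/DD/YYYY text
    country: str                 # IV.
    province: str                # V.    or "Not Specified"
    county_city: Optional[str]   # VI.
    place: Optional[str]         # VII.  location or "Unit XX"
    description: str             # VIII.
    sector: Sector               # X.
    report: int                  # XI.   report number within the day
    event: int                   # XII.  event number within the report
    rank: int                    # XIII. order of appearance
    link: str                    # XIV.  source link
    latitude: Optional[float] = None    # event location
    longitude: Optional[float] = None
    lat_city: Optional[float] = None    # nearest city
    long_city: Optional[float] = None   # IX.

    def __post_init__(self) -> None:
        # Reject values outside the categorical list for Country.
        if self.country not in COUNTRIES:
            raise ValueError(f"unexpected country: {self.country!r}")


# Hypothetical record; none of these values come from the real dataset.
example = AppearanceRecord(
    name="Jang Song Thaek", position=None, date="04/15/2011",
    country="North Korea", province="Not Specified", county_city=None,
    place="Unit XX", description="Inspected KPA Unit XX",
    sector=Sector.MILITARY, report=1, event=1, rank=1,
    link="https://example.invalid/report",
)
print(example.sector.name, example.rank)  # MILITARY 1
```

Using an enumeration for Sector and a fixed set for Country means a miscoded category fails loudly at load time rather than surfacing later in analysis.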

Contact and Feedback

NK News welcomes feedback and enquiries from users about their experience of using the NK Leadership Tracker and can be reached at:

[email protected]

Biographies of Contributors

Luke Herman
Project Manager 

Mr. Luke Herman is a graduate of the UCSD School of International Relations and Pacific Studies (IR/PS) where he concentrated on international politics with regional focuses on China and Southeast Asia. His undergraduate years were spent at Boston University where he double majored in history and international relations. During the summer of 2011, he interned for the Korea Economic Institute (KEI) and wrote a lengthy report on appearances made by Kim Jong Il and what they said about leadership dynamics within the country. His research interests include the China-North Korea relationship, elite political institutions in China, and the North Korean leadership.

Kevin J Conroy
Security Analyst and Consultant

Kevin has worked in South Korea, Afghanistan, Nigeria and London, quantitatively and qualitatively analyzing security issues including strategic communication, terrorism, drivers of conflict and insurgency, and economic development. Since graduating with distinction in MSc Security Studies from University College London, he has worked with leading research organizations, such as Transparency International, and in the risk advisory industry covering political and security risks in Africa and South East Asia.

He is currently enjoying the security and development challenges in Kano, Northern Nigeria as a consultant working with Adam Smith International implementing a DFID project to improve the livelihoods of the population.

Kevin has worked professionally on a number of databases including the International War Mapping database at UCL and Terrorism Tracker for the Risk Advisory Group, and was extremely grateful to bring his experience to the North Korean Leadership Tracker.

Boris Bakhalov
Tableau Developer

Boris has over sixteen years of experience in data analysis across a variety of industries and roles. Prior to moving to Canada, Boris worked as head of the local office (Belarus) for IMS Health and led data collection and analysis from a panel of pharmacies and hospitals. Since 2008, as a professional Tableau developer based in Toronto, Canada, Boris has worked on a number of projects for retail, transportation and logistics, finance and non-profit clients.

Jaesung Ryu
Translation and Analysis 

Jaesung Ryu is an MPIA graduate of the School of International Relations and Pacific Studies (IR/PS) at UCSD. There, he concentrated on international politics and public policy with a regional focus on Korea. He received his B.A. in economics and the College of Social Studies from Wesleyan University. As a South Korean national, he completed his military service by serving as a translation officer for a South Korean military intelligence agency, gaining first-hand experience in security policy, intelligence cooperation and counterintelligence activities. His research interest is primarily focused on inter-Korean relations and the political economy of North Korea.