ADMIN 3 (2003-2017)

Data host

National Health Insurance Fund Administration, Central Administration of National Pension Insurance, National Tax and Customs Administration, National Labour Office, Educational Authority

The objective of the survey

Tracking workers and employers over a period of 15 years

The reference population

Half of the population aged 0+ in 2003

Data suppliers

Registers of OEP, ONYF, NAV, NMH, and OH

Sample size

ca. 5 174 486 people

Period

2003-2017, monthly

Main topics of data

Demographics (age, gender)

Educational attainment (for those with at least one unemployment spell)

Employment status

Occupation

Wages

Transfer receipt

Firm-level data (sector, size and ownership)

Education

Job seeker information

Main characteristics of the survey

The units of observation are payment records (contribution payments and/or transfer receipts) on the 15th day of a given month for a given person. Observed individuals may have several payment records at a given point in time.
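As a minimal sketch of what this record structure implies for analysis, the Python snippet below collapses several payment records of one person in one month into a single person-month row. The field names (person_id, month, source, amount) and all values are hypothetical and are not the actual variable names of the database.

```python
from collections import defaultdict

# Hypothetical payment records: one row per payment observed on the 15th of a month.
records = [
    {"person_id": 1, "month": "2003-01", "source": "wage", "amount": 120},
    {"person_id": 1, "month": "2003-01", "source": "transfer", "amount": 30},
    {"person_id": 2, "month": "2003-01", "source": "wage", "amount": 200},
]

# Group records by (person, month), since one person may have several records.
person_month = defaultdict(list)
for rec in records:
    person_month[(rec["person_id"], rec["month"])].append(rec)

# Summarise each person-month: number of records and total amount.
summary = {
    key: {"n_records": len(recs), "total": sum(r["amount"] for r in recs)}
    for key, recs in person_month.items()
}
```

Whether summing amounts across records is meaningful depends on the research question; the point is only that the person-month level has to be constructed explicitly.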

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form.

Safeguards for data

Longitudinal data on educational careers

János Köllő: The benefits of linked administrative panel data an example of the KRTK Databank Admin-databases

Anna Sebők: The KRTK’s linked administrative panel dataset – Admin4

ADMIN 1 (2002-2009)

Data host

National Health Insurance Fund Administration, Central Administration of National Pension Insurance, National Tax and Customs Administration, National Labour Office, Educational Authority

The objective of the survey

The objective of the data is career tracking (even for specific groups).

The reference population

Half of the population aged 5-74 in 2002

Data suppliers

OEP, ONYF, NAV, NMH, OH by administrative data collection

Sample size

ca. 4 602 000 people

Period

2002-2009, monthly

Main topics of data

Demographics (age, gender)

Educational attainment (for those with at least one unemployment spell)

Employment status

Occupation

Wages

Transfer receipt

Main characteristics of the survey

The units of observation are payment records (contribution payments and/or transfer receipts) on the 15th day of a given month for a given person. Observed individuals may have several payment records at a given point in time.

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form.

TIME USE SURVEY

Data host

Hungarian Central Statistical Office

Objective of the survey

To measure and present the time use (living conditions, lifestyle) of the population living in households

Target population

Population aged 10-84 living in selected households

Data suppliers

Questionnaire survey; population living in private households in Hungary

Sample size

11 200 addresses, 13 000 time use diaries

In case of people aged 10-14 an additional survey was conducted with the caregiver

Period

01.10.2009 – 30.09.2010.

Main fields of data

Detailed data on time use of people in the sample

Demographic characteristics

Age, sex, educational attainment, county, type of settlement, economic activity

Other characteristics, comments

The sample is the discarded or failed address population from the Labour Force Survey (LFS), supplemented by addresses selected from new dwellings to improve coverage.

Weighting was based on sex, age group, type of settlement of the population carried forward

The dataset represents each person only by the first diary recorded, although two time use diaries were recorded (one for a weekday and one for a weekend day) for those in the 15-24 and 25-69 age groups who were in paid employment.

Note: Time use data of 1986/87 and 1999/2000 are available in the Databank of TÁRKI

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form, solely for researchers or research assistants employed by KRTK. Available for external researchers of the KRTK, researchers participating in joint research with researchers employed by the KRTK (co-authors), PhD students or thesis writers of researchers employed by the KRTK.

OPTEN

Data host

The objective of the survey

The reference population

Data suppliers

Sample size

Period

Main topics of data

Main characteristics of the survey


MASS LAYOFFS 1990-1995

The file consists of data derived from press information. The events were created from a database in which layoff news was recorded between 1990 and 1995.

Data host

HUN-REN Centre of Economic and Regional Studies

Data providers

The file was created from layoff news from the press

Sample size

The file contains a total of 3126 layoff events of 1511 companies.

Period

From 1990 to 1995

Other characteristics, comments

The unit of observation in the database is a layoff event. When the same event was reported in several news items, it appears only once in the file.
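The collapse from news items to events can be sketched as follows. This is an illustration only: the field names are hypothetical, and identifying an event by company and date is an assumption, since the source does not say how duplicate reports were matched.

```python
# Hypothetical news items; the same layoff may be reported by several outlets.
news_items = [
    {"company": "A", "date": "1991-03", "outlet": "paper 1"},
    {"company": "A", "date": "1991-03", "outlet": "paper 2"},
    {"company": "B", "date": "1992-07", "outlet": "paper 1"},
]

# Assumption: an event is identified by the pair (company, date).
events = {}
for item in news_items:
    key = (item["company"], item["date"])
    events.setdefault(key, []).append(item["outlet"])

n_events = len(events)                                # one row per layoff event
n_companies = len({company for company, _ in events})  # distinct companies
```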

Data availability

The database can be requested for a designated research purpose on a server or by data release, by submitting a data request form.

DISTRICT LEVEL DATA IN BUDAPEST (B-STAR)

Data host

Hungarian Central Statistical Office

The objective of the survey

The reference population

Data suppliers

Sample size

Period

Main topics of data

Main characteristics of the survey


CONTRIBUTION RETURN DATABASE OF THE NATIONAL TAX AND CUSTOMS ADMINISTRATION

Data host

National Tax and Customs Administration

Objective

The individual-level contribution return database of the National Tax and Customs Administration contains the monthly tax returns of employers obliged to submit tax returns for all taxes and contributions related to payments and benefits paid to individuals.

Reference population

A 10% anonymized random sample of individuals with earned income.

Sample size

Data of five years: a total of 30 million observations on 2 million people.

Period

Monthly data between 01/01/2007 and 31/12/2011

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form, solely for researchers or research assistants employed by KRTK. Available for external researchers of the KRTK, researchers participating in joint research with researchers employed by the KRTK (co-authors), PhD students or thesis writers of researchers employed by the KRTK.

National assessment of basic competencies

The National Assessment of Basic Competencies (NABC) was held each year in May for all students in grades 6, 8 and 10 until 2021. Since 2022, the assessment has been administered digitally, with new tests and additional grades participating. The databases contain the test results and the responses to the background questionnaires at student and settlement levels.

Among students in vocational training under the Vocational Education Act (vocational training school, technical school), only 10th graders were required to participate in the mathematics and reading comprehension tests. Students in vocational training under the Vocational Education Act do not participate in the natural science, history, digital culture, foreign language or target language measurements. Students participating in vocational training under the Public Education Act are an exception.

Grades participating in the measurement and pilot measurements:

4th grade: reading comprehension, mathematics
5th grade: reading comprehension, mathematics, digital culture, history pilot measurement
6th-11th grades: reading comprehension, mathematics, natural science, foreign language measurement among those learning English or German as a first foreign language, and target language measurement in English, German, Chinese. Digital culture, history pilot measurement.

Data host

Educational Authority

Purpose

Public report about the test results for the institutions and maintainers

Population

Students in grades 4, 5, 6, 7, 8, 9, 10 and 11 and their institutions in the given year, except settlements/schools that educate only students with special educational needs.

Data providers

The tests are processed centrally. The institutional and settlement background questionnaires are filled out by the head of the institution/settlement. The student questionnaires are filled out by the student or their family.

Sample size

Every student in the given grade, except those who were absent at the time of the test and those who declined to take it. (In 2006 and 2007 the survey was complete only in grade 8; grades 6 and 10 were covered by representative samples.)

Pursuant to Section 181A § of Government Decree 12/2020. (II. 7.) on the implementation of the Vocational Education Act, vocational training institutions only participate in the mathematics and reading comprehension test in the 10th grade. Students participating in vocational training falling under the scope of the Public Education Act are an exception to this. Thus, in grades 9-11, the comparison groups called technical schools and vocational schools only include institutions falling under the scope of the Public Education Act. Due to the differences in the range of participants, the national average results in grades 9-11 – with the exception of reading comprehension and mathematics in grade 10 – cannot be compared with the national average results in other grades.

Time period covered

2006-2019, 2021-2024 (there was no survey in 2020 due to the Covid-19 pandemic)

Levels of the data

Institutional
Settlement
Student

Main groups of data

1. Institutional data

Contains the institution’s address, the responses to the school background questionnaire (equipment, features of the school), and the average test results and their variances for grades 6, 8 and 10 in every settlement of the institution, keyed by the institution’s ID number.

2. Settlement data

Contains the settlement’s address, the responses to the school background questionnaire (equipment, features of the settlement), and the average test results and their variances for grades 4, 5, 6, 7, 8, 9, 10 and 11 of the given settlement, keyed by the ID numbers of the institution and the settlement.

3. Student data

Contains, by unique student identifier: the responses to the student background questionnaire (family background: social, cultural, economic); data on test completion (number of refusals, absences and completed tests); the level and scale score on the mathematics and reading literacy tests; and the average test scores and their variances for the student’s class, grade and settlement of the institution. Every student (observation) also carries the institution’s ID number, the settlement code and the class label.

From 2018, the Educational Authority assigns the student data a new, anonymous identifier. This variable has been added retroactively to the earlier databases, covering 2008-2017. Thus the variable azon_uj holds the student’s anonymous identifier for 2008-2024.

Other features, notes

The institutional datasets can be merged with the settlement and student data using the institution’s ID number, while the settlement and student datasets can be merged using the ID number and the settlement code. Since 2008 the students’ unique identifiers have not changed over the years, so the data from each grade (6, 8, 10) can be linked at the student level.
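Student-level linking across grades might look like the sketch below. It assumes the anonymous identifier azon_uj is present in both extracts; the score field name (math) and all values are hypothetical.

```python
# Hypothetical student-level extracts from two assessment waves.
grade6 = [{"azon_uj": "s1", "math": 1480}, {"azon_uj": "s2", "math": 1520}]
grade8 = [{"azon_uj": "s1", "math": 1600}, {"azon_uj": "s3", "math": 1550}]

# Index the later wave by the anonymous student identifier.
grade8_by_id = {row["azon_uj"]: row for row in grade8}

# Inner join: keep only students observed in both waves.
linked = [
    {
        "azon_uj": row["azon_uj"],
        "math_g6": row["math"],
        "math_g8": grade8_by_id[row["azon_uj"]]["math"],
    }
    for row in grade6
    if row["azon_uj"] in grade8_by_id
]
```

An inner join is used here so that students missing from either wave drop out; a study of attrition would instead keep the unmatched rows.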

The data have been merged to the KIR-STAT data (number of pupils, types of education, identifiers) every year on an institutional level, concerning the given school year.

The databases from 2021 onward are provisional, because linking the KIR-STAT data to institutions (and their settlements) was complicated by the transformation of the vocational training sector. Variables merged from KIR-STAT are included in the NABC databases, but many cases have missing values (where the merge failed). Even where values are present (successful merge), it remains uncertain whether the match on the variables omid and telephely was correct.

NABC tests have been completed digitally since 2022. The completion rate of the student background questionnaires dropped drastically compared with previous years.

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form, solely for researchers or research assistants employed by KRTK. Available for external researchers of the KRTK, researchers participating in joint research with researchers employed by the KRTK (co-authors), PhD students or thesis writers of researchers employed by the KRTK.

Variable catalog

APPLICATION AND ADMISSION TO HIGHER EDUCATION (FELVI)

The FELVI database contains data of the higher education admissions forms and the available academic programmes between 2001 and 2024. Each year consists of 3 datasets:

Institutional data
Admission data
Individual data

Data host

Educational Authority

The objective of the survey

Registering data of applications and admission results

The reference population

All applicants who submitted an application form in the given year.

Sample size

All applicants who submitted an application form in the given year, and all programmes of all institutions that were available to applicants in any of the admission rounds (normal, cross-semester, additional admission procedure).

Period

2001-2025, annually

Levels of data

Institutional data
Admission data
Individual data

Main data groups by level

1. Institutional data

Contains data for each field of study by admission procedure, type of funding, form of education (daytime, evening, distance, correspondence course) and level of training for all higher education institutions.

2. Application data

The application forms contain the application data in long format, i.e. the data is not listed per applicant but per application (per line), indicating which higher education place the applicant was ultimately admitted to.

3. Individual data

The application form contains the following information for each applicant: secondary school data, personal data, academic and secondary school leaving exam results, previous higher education studies, additional points, admission score. The higher education admission system changed in 2024, which affects the point-related variables in the 2024 data.

Other characteristics, comments

Institutional and application data can be linked within each year on the variables eljaras, karkod, karkod_regi, szaknev, szint, munkarend and fin_form. Application and individual data can be linked within each year on the variables eljaras and id.
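A minimal sketch of this two-step linkage within one year, in plain Python. The key variable names are those listed above; the payload fields capacity and score and all values are hypothetical, and the sketch assumes each key combination is unique in the institutional and individual data.

```python
# Key variables named in the text; every value below is made up.
INST_KEYS = ("eljaras", "karkod", "karkod_regi", "szaknev",
             "szint", "munkarend", "fin_form")
INDIV_KEYS = ("eljaras", "id")

programme = {"eljaras": "A", "karkod": "X1", "karkod_regi": "X0",
             "szaknev": "economics", "szint": "BA",
             "munkarend": "N", "fin_form": "A"}

institutional = [{**programme, "capacity": 50}]          # one row per programme
applications = [{**programme, "id": 1}, {**programme, "id": 2}]  # one row per application
individuals = [{"eljaras": "A", "id": 1, "score": 400},
               {"eljaras": "A", "id": 2, "score": 380}]  # one row per applicant


def index_by(rows, keys):
    """Map each row's key tuple to the row (assumes keys are unique)."""
    return {tuple(row[k] for k in keys): row for row in rows}


inst_idx = index_by(institutional, INST_KEYS)
indiv_idx = index_by(individuals, INDIV_KEYS)

# Left join from the application level: unmatched rows keep only their own fields.
linked = []
for app in applications:
    inst = inst_idx.get(tuple(app[k] for k in INST_KEYS), {})
    person = indiv_idx.get(tuple(app[k] for k in INDIV_KEYS), {})
    linked.append({**inst, **person, **app})
```

Starting from the application level keeps the long format (one row per application), which is the natural unit when several applications belong to one applicant.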

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form, solely for researchers or research assistants employed by KRTK. Available for external researchers of the KRTK, researchers participating in joint research with researchers employed by the KRTK (co-authors), PhD students or thesis writers of researchers employed by the KRTK.

FELVI guidemap

Changes to Felvi databases in 2018-2025

PUBLIC EDUCATION (KIR-STAT)

The KIR-STAT database contains yearly statistical data on institutions performing public education functions at different levels

Data host

Educational Authority

The objective of the survey

Statistical data of institutions in public education

Target population

All public education institutions in Hungary

Data suppliers

Institutions registered through the KIR data system operated by the Educational Authority

Sample size

All public education institutions and their settlements

Period

2001-2024, by year (deadline for reporting by the institutions is 15 October of given year)

Level of data

Institutional
Settlement
Curriculum

Optional:

Vocational training
Remedial education

Main fields of data

Main data of the seat institution and settlements
Facilities, classrooms
Teachers, other school staff
Classes
Number of students in different breakdowns
Private students
Language education
Students with successful vocational exams and matriculation exams
Early development, developmental preparation
Data of nurseries, kindergartens
Enrolment of compulsory school aged students, opening enrolment data
Students with special needs (SEN)

Main characteristics, comments

The KIR-STAT database is structured hierarchically: it contains separate data sheets at the level of institutions, at the level of settlements, and at the level of programmes running in public education institutions. In addition, separate data sheets are filled in by the institutions providing vocational training, remedial education, speech therapy and language guidance, expert and rehabilitation services and elementary art education.

Availability

The database can be requested for a designated research purpose on a server or by data release, by submitting a data request form.

ROMA SURVEY 1971

In 1971, István Kemény initiated and led a representative survey on the situation of the Roma (Gypsy) population of Hungary. The questionnaires and part of the digitised data have been destroyed or lost. This file was compiled from three incomplete data sets that survived until 2012.

The objective of the survey

Measuring demographic characteristics, living conditions, nutrition, education and employment, income and wealth of the Roma population

The reference population

Roma population living in selected settlements of Hungary. Stratified sampling using prior information on Roma density across and within settlements.

Data suppliers

2 percent of the Roma households interviewed using questionnaires. The adult respondents were asked to provide information on their children and relatives.

Sample size

The final sample contains 2912 people older than 14, who belong to 1056 households. (The original sample contained data of 3510 people older than 14.)

Period

1971

Main topics of data

Main groups of the variables:

Education, languages, literacy

Occupation, employment, wages and other benefits

Social transfers, pensions, job search

Alternative sources of income (collecting, etc.)

Marital status, spouses and children

State of health, nutrition

Incarceration experience

Housing, durable goods, livestock

Expenditures and revenues (selected items)

Household characteristics

Availability

The database can be requested for a designated research purpose on a server or by data release, by submitting a data request form.

FINANCIAL STATEMENT OF FIRMS IN THE WAGE SURVEY

Data host

National Tax and Customs Administration, Public Employment Service

The objective of the survey

Broadening the scope for high-quality research by merging individual and firm-level data.

The reference population

Full-time employees and, since 2002, part-time employees of economic organizations in Hungary that are legal entities with at least 4 employees. In 2002, 2005, 2007 and 2008 non-profit organizations were also surveyed.

Data suppliers

See the entries on the Wage Survey and Corporate Financial Statements

Sample size

9-11 thousand enterprises and 125-185 thousand employees

Period

2000-2011, annually

Main topics of data

Demographics

Education, labour market experience

Wages, occupation

Headcount, ownership, sectoral affiliation, sales revenues, material cost, fixed assets, profits

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form.

CORPORATE FINANCIAL STATEMENTS

Data host

National Tax and Customs Administration

The objective of the survey

Tax collection

The reference population

Enterprises using double-entry bookkeeping under Act LXXXI of 1996 on corporate and dividend taxes

Data suppliers

All enterprises belonging to the above category

Sample size

150-400 thousand enterprises

Period

2000-2024, annually

Main topics of data

Region, headcounts and sector of the enterprise

Selected data from the firms’ financial statements

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form, solely for researchers or research assistants employed by KRTK. Available for external researchers of the KRTK, researchers participating in joint research with researchers employed by the KRTK (co-authors), PhD students or thesis writers of researchers employed by the KRTK.

CENSUS TRACTS AND THEIR TRANSPORT CONNECTIONS (GEO)

The GEO database covers all Hungarian census districts (45 555 units) and assigns every educational institution, health institution and company workplace to these districts. It also contains the distances between districts and the time and cost needed to travel from one to another. The database was produced by the Hungarian Central Statistical Office, the Central European University, the Centre for Economic and Regional Studies of the Hungarian Academy of Sciences (CERS), GEOX Ltd., Terra-Laky Ltd. and AntaresNav Ltd., with support from the Hungarian Academy of Sciences. It was created between 2013 and 2015.

Data host

CERS, CEU

The objective of the survey

The database links Hungarian census districts with relevant economic, education and health indicators, together with the distances between districts and the characteristics of their surroundings.

Distributors of the database

HCSO

CEU

Geox Ltd.

Terra-Laky Ltd.

AntaresNav Ltd.

Sample size

45 555 census units

Period

The dataset is based on the districts of the 2011 Hungarian Census

Public transport data: 2014

Private car data: 2015

Company seats/workplace data: 2014

Education, health care, social institutions: 2015

Other features, comments

The database was divided into parts because of the large number of variables and cases. Researchers can work with the parts that match the goals of their research.

Main topics of data

Static description of the census districts

Variables on the districts, and travel time and cost between pairs of districts

Administrative data on institutions (addresses of nurseries and pre-schools, public education and tertiary education institutions, health and social institutions)

Data on company workplaces: the database links census districts to the tax identification numbers of companies, which allows financial statements to be merged with the database

Postal codes of the districts

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form.

ADMIN 2 (2003-2011)

Data host

National Health Insurance Fund Administration, Central Administration of National Pension Insurance, National Tax and Customs Administration, National Labour Office, Educational Authority

The objective of the survey

The objective of the data is career tracking (even for specific groups).

The reference population

Half of the population aged 5-74 in 2003

Data suppliers

OEP, ONYF, NAV, NMH, OH by administrative data collection

Sample size

ca. 4 602 000 people

Period

2003-2011, monthly

Main topics of data

Demographics (age, gender)

Educational attainment (for those with at least one unemployment spell)

Employment status

Occupation

Wages

Transfer receipt

Main characteristics of the survey

The units of observation are payment records (contribution payments and/or transfer receipts) on the 15th day of a given month for a given person. Observed individuals may have several payment records at a given point in time.

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form.

ADMIN2 guide

ACCESSIBILITY WITH REGIONAL BUS LINES (VOLAN)

Data host

CData Limited Partnership

The objective of the survey

The objective of the data is to analyse the spatial links between settlements and to model their accessibility by public transport.

The reference population

The set of bus lines operated by Volan companies

Data suppliers

Published timetables of the Volan companies

Sample size

434 958 bus lines running between 3109 places of departure and 3100 places of arrival

Period

The dataset refers to the published timetable valid on 16 February 2006.

Main topics of data

Departure and arrival times, places of departure and arrival, characteristics of the buses in a given half-hour interval

Characteristics of the settlements

Main characteristics of the survey

The survey contains data on buses which travel between 5.30 and 9.00 a.m., if their travel time is less than 90 minutes.
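The inclusion rule above can be written as a simple filter. This sketch assumes the 5.30 to 9.00 a.m. window applies to the departure time, which the text does not state explicitly; the field names and the sample trips are hypothetical.

```python
from datetime import time

WINDOW_START = time(5, 30)
WINDOW_END = time(9, 0)
MAX_TRAVEL_MINUTES = 90


def in_scope(trip):
    """True if the trip departs within the morning window and takes under 90 minutes."""
    return (WINDOW_START <= trip["departure"] <= WINDOW_END
            and trip["travel_minutes"] < MAX_TRAVEL_MINUTES)


# Hypothetical timetable entries.
trips = [
    {"line": "a", "departure": time(6, 0), "travel_minutes": 45},
    {"line": "b", "departure": time(10, 15), "travel_minutes": 30},   # too late
    {"line": "c", "departure": time(7, 30), "travel_minutes": 120},   # too long
]

selected = [trip["line"] for trip in trips if in_scope(trip)]
```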

UNEMPLOYMENT REGISTER – MUNICIPALITY LEVEL TIME SERIES

Data host

National Employment Office

The objective of the survey

The dataset is useful for calculating the unemployment rate at the regional level.

The reference population

People who registered as unemployed in regional labour offices.

Data suppliers

Labour offices

Sample size

The settlements of the Hungarian administrative system, ca. 3150.

Period

1990-1997: data for March, June, September and December

1998-2024: monthly data

Main topics of data

Demographics, Education, Services, Length of registration

Reasons for exit from and entry into the labour market

Availability

The database can be requested for a designated research purpose on a server or by data release, by submitting a data request form.

Household Budget Survey

Data host

Hungarian Central Statistical Office

The objective of the survey

The survey measures the monetary and non-monetary income and expenditure of the population.

The reference population

Hungarian citizens living in private households in Hungary.

Data suppliers

The population of Hungary living in private households, surveyed by questionnaires.

Sample size

7.5-10 thousand households, 20-26 thousand individuals

Period

1993-2012, annually

Main topics of data

The expenditures and revenues of the households

Demographics

Economic activity and housing conditions

Durable consumer goods

Main characteristics of the survey

The sample is drawn using multi-stage stratified sample design, and it is independent of the size of settlement.

A household stays in the sample for 3 years; one third of the whole sample is replaced annually.

The weighting method is generalized iterative scaling.

Availability

The database can be requested for a designated research purpose on a server by submitting a data request form, solely for researchers or research assistants employed by KRTK. Available for external researchers of the KRTK, researchers participating in joint research with researchers employed by the KRTK (co-authors), PhD students or thesis writers of researchers employed by the KRTK.

HBS guide

REGIONAL STATISTICS (T-STAR)

Data host

Hungarian Central Statistical Office

The objective of the survey

The survey yields information on the universe of Hungary’s settlements, with data on their demographics, educational composition, unemployment and various indicators of the local economy.

The reference population

Administratively independent settlements (municipalities)

Data suppliers

The data are collected by the Central Statistical Office mainly from the local governments, but also from ministries and governmental institutions.

Sample size

3 164 settlements, which existed for at least one day since 1 January 1990

Period

1990-2024, annually

Main topics of data

Accidents

Demographics

Health

Economic Entities

Justice

Industry

Trade, hospitality

Transportation, communication

Culture, public education

Government

Communal Infrastructure

Pollution

House stock

Agriculture

Unemployment

Education

Municipal Aid

Municipal budget

Social services

Tourism

Income taxes

Employees

Availability

The database can be requested for a designated research purpose on a server or by data release, by submitting a data request form.

Labour Force Survey

Data host

Hungarian Central Statistical Office

The objective of the survey

The survey provides information on the economic activity of the population, following Eurostat guidelines. Rotating panel, with each cohort staying in the sample for six quarters.
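The consequence of this rotation design for panel use can be sketched as follows. The sketch assumes that one new cohort of equal size enters every quarter, which is a simplification not stated in the text; quarters are numbered 0, 1, 2, ... for illustration.

```python
QUARTERS_IN_SAMPLE = 6  # from the text: each cohort stays for six quarters


def cohorts_in_sample(quarter):
    """Entry quarters of the cohorts interviewed in `quarter`."""
    first = max(0, quarter - QUARTERS_IN_SAMPLE + 1)
    return set(range(first, quarter + 1))


current = cohorts_in_sample(10)
previous = cohorts_in_sample(9)

# Share of the current sample that was also interviewed one quarter earlier.
overlap_share = len(current & previous) / len(current)
```

Under these assumptions five of the six cohorts present in a quarter were also interviewed in the previous quarter, which is what makes quarter-to-quarter transition analysis possible.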

The reference population

People aged 15-74, living in private households. The sampling units are dwellings.

Data suppliers

Residents in Hungary living in households

Sample size

The number of observed households varied between 22,000 and 34,000 in 1992-2012. The sample contains 45,000-70,000 people aged 15-74 and between 15,000 and 20,000 people outside this age range.

Period

1992 to 2025 Q3, quarterly

Main topics of data

Demographics, Education

Labour market status

Transfer status

Household characteristics

Residence characteristics

Main characteristics of the survey

The sample is drawn using multi-stage stratified sample design.

Weights ensure that the sample is representative.

Availability

From the second quarter of 2014, under the KSH-KRTK agreement, the data are available on the server exclusively for projects affiliated with the KRTK, by filling out an application form and after approval by the KSH.

LFS guide