Közgazdaság- és Regionális Tudományi Kutatóközpont Adatbank

ADMIN3 (2003-2017)

Data host

National Health Insurance Fund Administration, Central Administration of National Pension Insurance, National Tax and Customs Administration, National Labour Office, Educational Authority

The objective of the survey

Tracking workers and employers over a period of 15 years

The reference population

Half of the population aged 0+ in 2003

Data suppliers

Registers of OEP, ONYF, NAV, NMH, and OH

Sample size

ca. 5 174 486 people

Period

2003-2017, monthly

Main topics of data

Demographics (age, gender)

Educational attainment (for those with at least one unemployment spell)

Employment status

Occupation

Wages

Transfer receipt

Firm-level data (sector, size and ownership)

Education

Job seeker information

Main characteristics of the survey

The units of observation are payment records (contribution payments and/or transfer receipts) on the 15th day of a given month by a given person. The observed individuals may have several payment records at a given point in time.
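Because one person can have several payment records in the same month, person-month analyses usually need to group the records first. The sketch below illustrates this on made-up records; the field names (`person_id`, `month`, `kind`, `amount`) are hypothetical and are not the actual ADMIN3 variable names.

```python
from collections import defaultdict

# Made-up payment records; field names are illustrative, not ADMIN3 variables.
records = [
    {"person_id": 1, "month": "2010-03", "kind": "contribution", "amount": 120},
    {"person_id": 1, "month": "2010-03", "kind": "transfer", "amount": 40},
    {"person_id": 1, "month": "2010-04", "kind": "contribution", "amount": 125},
    {"person_id": 2, "month": "2010-03", "kind": "transfer", "amount": 60},
]


def group_by_person_month(rows):
    """Collect all payment records that share a person and a month."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["person_id"], row["month"])].append(row)
    return dict(grouped)


grouped = group_by_person_month(records)
# Person-months with more than one simultaneous payment record.
multiple = sorted(key for key, rows in grouped.items() if len(rows) > 1)
```

Here `multiple` lists the person-months that would be double-counted if each record were treated as a separate person-month.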

Availability

Available for researchers of the CERS, their co-authors, and students until February 1, 2022. Unconditionally available after this date.

ADMIN 1 (2002-2009)

Data host

National Health Insurance Fund Administration, Central Administration of National Pension Insurance, National Tax and Customs Administration, National Labour Office, Educational Authority

The objective of the survey

The objective of the data is career tracking (even for specific groups).

The reference population

Half of the population aged 5-74 in 2002

Data suppliers

OEP, ONYF, NAV, NMH, OH by administrative data collection

Sample size

ca. 4 602 000 people

Period

2002-2009, monthly

Main topics of data

Demographics (age, gender)

Educational attainment (for those with at least one unemployment spell)

Employment status

Occupation

Wages

Transfer receipt

Main characteristics of the survey

The units of observation are payment records (contribution payments and/or transfer receipts) on the 15th day of a given month by a given person. The observed individuals may have several payment records at a given point in time.

Availability

Unconditionally available for academic research

TIME USE SURVEY

Data host

Hungarian Central Statistical Office

Objective of the survey

To measure and present the time use (living conditions, lifestyle) of the population living in households

Target population

Population aged 10-84 living in selected households

Data suppliers

Questionnaire survey; population living in private households in Hungary

Sample size

11 200 addresses, 13 000 time use diaries

For people aged 10-14, an additional survey was conducted with the caregiver

Period

01.10.2009 – 30.09.2010.

Main fields of data

Detailed data on time use of people in the sample

Demographic characteristics

Age, sex, educational attainment, county, type of settlement, economic activity

Other characteristics, comments

The sample is the discarded or failed address population from the Labour Force Survey (LFS), supplemented by addresses selected from new dwellings to improve coverage.

Weighting was based on the sex, age group and type of settlement of the population carried forward

The dataset represents the whole population only through the first diary recorded. However, two time use diaries were recorded, one for a weekday and one for a weekend day, for those in the age groups 15-24 and 25-69 who were in paid employment.

Note: Time use data of 1986/87 and 1999/2000 are available in the Databank of TÁRKI

OPTEN

Data host

The objective of the survey

The reference population

Data suppliers

Sample size

Period

Main topics of data

Main characteristics of the survey

MASS DISMISSALS 1990-1995

Work in progress

DISTRICT LEVEL DATA IN BUDAPEST (B-STAR)

Data host

Hungarian Central Statistical Office

The objective of the survey

The reference population

Data suppliers

Sample size

Period

Main topics of data

Main characteristics of the survey

CONTRIBUTION RETURN DATABASE OF THE NATIONAL TAX AND CUSTOMS ADMINISTRATION

Data host

National Tax and Customs Administration

Objective

The individual-level contribution return database of the National Tax and Customs Administration contains the monthly tax returns of employers obliged to submit tax returns for all taxes and contributions related to payments and benefits paid to individuals.

Reference population

A 10% anonymized random sample of individuals with earned income.

Sample size

Data of five years: a total of 30 million observations on 2 million people.

Period

Monthly data between 01/01/2007 and 31/12/2011

NATIONAL ASSESSMENT OF BASIC COMPETENCIES (NABC)

The National Assessment of Basic Competencies (NABC) is held each year in May and covers all students in the 6th, 8th and 10th grades. The database contains the results of the mathematics and reading literacy tests and the responses to the background questionnaires at different levels (institutional, settlement, student).

From 2023 onwards, the number of participants in the measurement has been extended to include students in 7th, 9th and 11th grade. However, only students in public education were involved in the measurement in 9th and 11th grades, and in 10th grade in the field of Science.

Data host

Educational Authority

Purpose

Public report about the test results for the institutions and maintainers.

Population

6th, 8th, 10th grade students and their institutions in the given year (except those settlements/schools that only educate SEN students). 7th, 9th, 11th grade students and their institutions in the given year (except those settlements/schools that only educate students with special education needs), available from the year 2023.

Data providers

The tests are processed centrally. The institutional and settlement background questionnaires are filled out by the head of the institution/settlement, and the student questionnaires are filled out by the student or their family.

Sample size

Every student in the given grade, except those who were absent at the time of the test and those who declined to take it. (In 2006 and 2007 the assessment covered all students only in the 8th grade; in the 6th and 10th grades a representative sample was used.)

Time period covered

2006-2019, 2021-2023 (National Assessment of Basic Competencies in 2020 was cancelled due to the pandemic)

Levels of the data

Institutional

Settlement

Student

Main groups of data

1. Institutional data

Contains data of the institution’s address, the responses of the school background questionnaire (equipment, features of the school), the average test results and their variances of the 6th, 8th and 10th grades of every settlement of the institution, by the ID number of the institution.

***From 2019 the variables of institutional data appear in the settlement database.

2. Settlement data

Contains data of the settlement’s address, the responses of the school background questionnaire (equipment, features of the settlement), the average test results and their variances of the 6th, 7th, 8th, 9th, 10th and 11th grades of the given settlement, by ID numbers of the institution and the settlement.

3. Student data

Contains, by unique student identifier, the responses to the student background questionnaire (family background: social, cultural, economic), data related to test completion (numbers of declined, missing and completed tests), the level and scale score on the mathematics and reading literacy tests, and the average test scores and their variances for the given class, grade and settlement of the institution. Every student observation also carries the ID number of the institution, the settlement code and the class label.

*** From 2018, the Educational Authority assigns a new, anonymous identifier to the student data. This variable has been added retroactively to the earlier databases for the years 2008 – 2017. Thus the variable azon_uj holds the student’s anonymous identifier for 2008 – 2023.

Other features, notes

The institutional datasets can be merged with the settlement and student data using the ID number, while the settlement and student datasets can be merged using the ID number and the settlement codes. Since 2008, the unique identifiers of the students have not changed over the years, so the data of each grade (6th, 8th, 10th) can be linked at the student level.
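A minimal sketch of student-level linking across two assessment years is given below. Only `azon_uj` is a variable name taken from the description; the records, scores and the other field names are made up for illustration.

```python
# Made-up student records from two assessment years.
# `azon_uj` is the anonymous identifier; every other field name is hypothetical.
grade6_records = [
    {"azon_uj": "A1", "math_score": 1480},
    {"azon_uj": "B2", "math_score": 1520},
]
grade8_records = [
    {"azon_uj": "A1", "math_score": 1610},
    {"azon_uj": "C3", "math_score": 1590},
]


def link_students(earlier, later, key="azon_uj"):
    """Inner-join two lists of student records on the shared identifier."""
    later_by_id = {row[key]: row for row in later}
    linked = []
    for row in earlier:
        match = later_by_id.get(row[key])
        if match is not None:
            linked.append({
                key: row[key],
                "math_earlier": row["math_score"],
                "math_later": match["math_score"],
            })
    return linked


linked = link_students(grade6_records, grade8_records)
```

Only students present in both years survive the join, so the linked panel is typically smaller than either cross-section.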

The data have been merged to the KIR-STAT data (number of pupils, types of education, identifiers) every year on an institutional level, concerning the given school year.

The completion rate of the student background questionnaires has dropped drastically compared to previous years. NABC tests have been completed digitally since 2022. Variables for the new measurement areas (science, language tests) can be found in the databases.

Access to the data

The chief of the Databank authorizes the data request in every case, according to the Databank’s policy.

APPLICATION AND ADMISSION TO HIGHER EDUCATION (FELVI)

The FELVI database contains data from higher education admission forms and on the available academic programmes between 2001 and 2023. Each year consists of 3 datasets:

Institutional data

Admission data

Individual data

Data host

Office of Education

The objective of the survey

Registering data of applications and admission results

The reference population

All applicants who submitted an admission form in the given year (in either of the two admission rounds).

Sample size

All applicants who submitted an application form in the given year. All programmes of all institutions that were available to applicants in either of the two admission rounds.

Period

2001-2023, annually

Levels of data

Institutional data

Admission data

Individual data

PUBLIC EDUCATION (KIR-STAT)

The KIR-STAT database contains yearly statistical data on institutions performing public education functions at different levels

Data host

Educational Authority

The objective of the survey

Statistical data of institutions in public education

Target population

All public education institutions in Hungary

Data suppliers

Institutions registered through the KIR data system operated by the Educational Authority

Sample size

All public education institutions and their settlements

Period

2001-2023, by year (deadline for reporting by the institutions is 15 October of given year)

Level of data

Institutional

Settlement

Curriculum

Optional:

Vocational training

Remedial education

Main fields of data

Main data of the seat institution and settlements

Facilities, classrooms

Teachers, other school staff

Classes

Number of students in different breakdowns

Private students

Language education

Students with successful vocational exams and matriculation exams

Early development, developmental preparation

Data of nurseries, kindergartens

Enrolment of compulsory school aged students, opening enrolment data

Students with special needs (SEN)

Main characteristics, comments

The KIR-STAT database is structured hierarchically: it contains separate data sheets at the level of institutions, at the level of settlements, and at the level of programmes running in public education institutions. In addition, separate data sheets are filled in by institutions providing vocational training, remedial education, speech therapy and language guidance, expert and rehabilitation services, and elementary art education.

ROMA SURVEY 1971 

In 1971, István Kemény initiated and led a representative survey on the situation of the Roma (Gypsy) population of Hungary. The questionnaires and part of the digitised data have been destroyed or lost. This file was compiled from three incomplete data sets that survived until 2012.

The objective of the survey

Measuring demographic characteristics, living conditions, nutrition, education and employment, income and wealth of the Roma population  

The reference population

Roma population living in selected settlements of Hungary. Stratified sampling using prior information on Roma density across and within settlements.

Data suppliers

2 percent of the Roma households interviewed using questionnaires. The adult respondents were asked to provide information on their children and relatives.

Sample size

The final sample contains 2912 people older than 14, who belong to 1056 households. (The original sample contained data of 3510 people older than 14.)

Period

1971

Main topics of data

Main groups of the variables:

Education, languages, literacy

Occupation, employment, wages and other benefits

Social transfers, pensions, job search

Alternative sources of income (collecting, etc.)

Marital status, spouses and children

State of health, nutrition

Incarceration experience

Housing, durable goods, livestock

Expenditures and revenues (selected items)

Household characteristics

Availability

Unconditionally available for academic research

FINANCIAL STATEMENT OF FIRMS IN THE WAGE SURVEY

Data host

National Tax and Customs Administration, Public Employment Service

The objective of the survey

Broadening the scope for high-quality research by merging individual and firm-level data.

The reference population

Full-time employees and, since 2002, part-time employees of economic organizations in Hungary that are legal entities with at least 4 employees. In 2002, 2005, 2007 and 2008, non-profit organizations were also surveyed.

Data suppliers

See the entries on the Wage Survey and Corporate Financial Statements

Sample size

9-11 thousand enterprises and 125-185 thousand employees

Period

2000-2011, annually

Main topics of data

Demographics

Education, labour market experience

Wages, occupation

Headcount, ownership, sectoral affiliation, sales revenues, material cost, fixed assets, profits

CORPORATE FINANCIAL STATEMENTS

Data host

National Tax and Customs Administration

The objective of the survey

Tax collection

The reference population

Enterprises using a double-entry bookkeeping system under Act LXXXI of 1996 on corporate and dividend taxes

Data suppliers

All enterprises belonging to the above category

Sample size

150-400 thousand enterprises

Period

2000-2022, annually

Main topics of data

Region, headcounts and sector of the enterprise

Selected data from the firms' financial statements

CENSUS TRACTS AND THEIR TRANSPORT CONNECTIONS (GEO)

The GEO database covers all Hungarian census districts (45 555 units) and assigns every educational institution, health institution and company workplace to its district. It also contains the distances between districts and the time and cost needed to travel from one to another. The database was produced by the Hungarian Central Statistical Office, the Central European University, the Centre for Economic and Regional Studies of the Hungarian Academy of Sciences (CERS), GEOX Ltd., Terra-Laky Ltd. and AntaresNav Ltd., with support from the Hungarian Academy of Sciences. It was created between 2013 and 2015.

Data host

CERS, CEU

The objective of the survey

To link the Hungarian census districts with the relevant economic, education and health indicators, together with the distances between the districts and the characteristics of their surroundings.

Distributors of the database

HCSO

CEU

Geox Ltd.

Terra-Laky Ltd.

AntaresNav Ltd.

Sample size

45 555 census units

Period

the base of the dataset is the districts of the Hungarian Census 2011

public transport data: 2014

private car data: 2015

company seats/workplace data 2014

education, health care, social institutions: 2015

Other features, comments

The database is split into parts because of the large number of variables and cases. Researchers can work with the parts relevant to their research goals.

Main topics of data

Static description of the census districts

Variables on the districts; travel time and cost between any two districts

Administrative data on the institutions (addresses of nurseries and pre-schools, public education and tertiary education institutions, and health and social institutions)

Data on company workplaces: the database links the census districts to the tax identification numbers of the companies, which allows financial statements to be merged with the database.

Postal codes of the districts

ADMIN 2 (2003-2011) 

Data host

National Health Insurance Fund Administration, Central Administration of National Pension Insurance, National Tax and Customs Administration, National Labour Office, Educational Authority

The objective of the survey

The objective of the data is career tracking (even for specific groups).

The reference population

Half of the population aged 5-74 in 2003

Data suppliers

OEP, ONYF, NAV, NMH, OH by administrative data collection

Sample size

ca. 4 602 000 people

Period

2003-2011, monthly

Main topics of data

Demographics (age, gender)

Educational attainment (for those with at least one unemployment spell)

Employment status

Occupation

Wages

Transfer receipt

Main characteristics of the survey

The units of observation are payment records (contribution payments and/or transfer receipts) on the 15th day of a given month by a given person. The observed individuals may have several payment records at a given point in time.

Availability

Available on a contractual basis

ACCESSIBILITY WITH REGIONAL BUS LINES (VOLAN)

Data host

CData Limited Partnership

The objective of the survey

The objective of the data is to analyse the spatial links between settlements and to model their accessibility by public transport.

The reference population

The set of bus lines operated by Volan companies

Data suppliers

Published timetables of the Volan companies

Sample size

434 958 bus lines running between 3109 places of departure and 3100 places of arrival

Period

The dataset refers to the published timetable valid on 16 February 2006.

Main topics of data

Departure and arrival times, places of departure and arrival, characteristics of the buses in a given half-hour interval

Characteristics of the settlements

Main characteristics of the survey

The survey contains data on buses which travel between 5.30 and 9.00 a.m., if their travel time is less than 90 minutes.
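The inclusion rule above can be sketched as a simple filter. This is only an illustration under a stated assumption: the description does not say whether the 5.30-9.00 window refers to departure or arrival, so the sketch applies it to the departure time. The trips and the function name are made up.

```python
from datetime import time

WINDOW_START = time(5, 30)
WINDOW_END = time(9, 0)
MAX_TRAVEL_MINUTES = 90


def in_scope(departure, travel_minutes):
    """Return True if a trip meets the inclusion rule as interpreted here.

    Assumption: the 5.30-9.00 window applies to the departure time.
    """
    within_window = WINDOW_START <= departure <= WINDOW_END
    return within_window and travel_minutes < MAX_TRAVEL_MINUTES


# Made-up trips: (departure time, travel time in minutes).
trips = [
    (time(5, 15), 30),   # departs before the window
    (time(6, 0), 45),    # in scope
    (time(8, 30), 90),   # travel time is not below 90 minutes
    (time(9, 0), 20),    # in scope, departs at the end of the window
]
kept = [trip for trip in trips if in_scope(*trip)]
```

Any analysis of accessibility outside the morning peak, or of longer journeys, is therefore outside the scope of this dataset.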

UNEMPLOYMENT REGISTER – MUNICIPALITY LEVEL TIME SERIES

Data host

National Employment Office

The objective of the survey

The dataset is useful for calculating the unemployment rate at the regional level.

The reference population

People who registered as unemployed in regional labour offices.

Data suppliers

Labour offices

Sample size

The settlements of the Hungarian administrative system, ca. 3150.

Period

1990-1997: data for March, June, September and December

1998-2024: monthly data

Main topics of data

Demographics, Education, Services, Length of registration

Reasons for exit from and entry into the labour market

HOUSEHOLD BUDGET SURVEY

Data host

Hungarian Central Statistical Office

The objective of the survey

The survey records the monetary and non-monetary income and expenditure of the population.

The reference population

Hungarian citizens living in private households in Hungary.

Data suppliers

The population of Hungary living in private households, surveyed by questionnaires.

Sample size

7.5-10 thousand households, 20-26 thousand individuals

Period

1993-2012, annually

Main topics of data

The expenditures and revenues of the households

Demographics

Economic activity and housing conditions

Durable consumer goods

Main characteristics of the survey

The sample is drawn using a multi-stage stratified sample design, and it is independent of the size of settlement.

A household stays in the sample for 3 years; a third of the whole sample is replaced annually.

The weighting method is generalized iterative scaling.

REGIONAL STATISTICS (T-STAR)

Data host

Hungarian Central Statistical Office

The objective of the survey

The survey yields information on the universe of Hungary’s settlements, with data on their demographics, educational composition, unemployment and various indicators of the local economy.

The reference population

Administratively independent settlements (municipalities)

Data suppliers

The data are collected by the Central Statistical Office mainly from the local governments, but also from ministries and governmental institutions.

Sample size

3 164 settlements that have existed for at least one day since 1 January 1990

Period

1990-2022, annually

Main topics of data

Accidents

Demographics

Health

Economic Entities

Justice

Industry

Trade, hospitality

Transportation, communication

Culture, public education

Government

Communal Infrastructure

Pollution

Housing stock

Agriculture

Unemployment

Education

Municipal Aid

Municipal budget

Social services

Tourism

Income taxes

Employees

LABOUR FORCE SURVEY

Data host

Hungarian Central Statistical Office

The objective of the survey

The survey provides information on the economic activity of the population, following Eurostat guidelines. Rotating panel, with each cohort staying in the sample for six quarters.
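The six-quarter rotation implies that consecutive quarterly waves share most of their respondents. The sketch below works through that logic under an assumption that is not stated in the description: exactly one equally sized cohort enters in every quarter. Function names are illustrative.

```python
QUARTERS_IN_SAMPLE = 6  # each cohort stays for six quarters (from the description)


def cohorts_in_sample(quarter):
    """Entry quarters of the cohorts interviewed in `quarter`.

    Assumption: exactly one new cohort enters in every quarter.
    Quarters are numbered consecutively (0, 1, 2, ...).
    """
    return set(range(quarter - QUARTERS_IN_SAMPLE + 1, quarter + 1))


def overlap_share(quarter_a, quarter_b):
    """Share of quarter_a's cohorts that are also interviewed in quarter_b."""
    a = cohorts_in_sample(quarter_a)
    b = cohorts_in_sample(quarter_b)
    return len(a & b) / len(a)


adjacent = overlap_share(10, 11)      # neighbouring quarters
one_year_apart = overlap_share(10, 14)
far_apart = overlap_share(10, 16)     # six quarters apart
```

Under these assumptions, two adjacent quarters share five of their six cohorts, and quarters six or more apart share none.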

The reference population

People aged 15-74, living in private households. The sampling units are dwellings.

Data suppliers

Residents in Hungary living in households

Sample size

The number of observed households varied between 22,000 and 34,000 in 1992-2012. The sample contains 45,000-70,000 people aged 15-74 and between 15,000 and 20,000 people outside this age range.

Period

1992-2024 Q4, quarterly

Main topics of data

Demographics, Education

Labour market status

Transfer status

Household characteristics

Residence characteristics

Main characteristics of the survey

The sample is drawn using a multi-stage stratified sample design.

Weights ensure that the sample is representative.

Availability

Conditional upon approval by the Central Statistical Office