Methodology and motivation for Numbeo.com
Before Numbeo was created, several other reports on cost of living indices already existed, e.g. reports from Mercer, UBS and the Economist.
However, the data behind those reports was usually hidden or expensive to purchase, and there was no guarantee that it was correct.
The research covered a very limited number of cities and is difficult to scale without a significant increase in expenses.
Also, there was no insight into the error rate of the manually collected data.
Manual collection of cost of living data is error prone:
- prices vary during the year (price oscillation), e.g. cheaper fruit and vegetables during the summer, or high fluctuation of potato prices because of lack of storage and high moisture
- prices of the same item usually differ between supermarkets, bars and restaurants
- there are different types of milk, cheese, etc. with different prices even in the same supermarket
- a country could face a temporary shortage of a given item, which could drive the price up temporarily (e.g. rice shortages)
- if only one person collects the prices, the possibility of human error is higher
Some of those reports publish just an index, which is not enough for a personal estimate, since a given person is not the average person; lifestyles differ in aspects such as:
- the size of a family (number of dependent persons)
- dining out or eating at home
- renting or owning an apartment
- driving or using public transport
- drinking alcoholic drinks and smoking or not
Other available cost of living sources didn't provide a systematic way to extract custom indices. Numbeo provides world-class software
for extracting various economic indicators for free (e.g. using our "Basket of goods and services" tool).
Before the Great Recession (the world economic crisis of 2007-2009), property prices worldwide looked crazy to
the founder of this website. The price of a small flat in the third world country he currently lives in was, at that time, the same as the price of 310 ultra
modern TFT monitors. The wild speculation in property prices suggested that people really needed a tool to
inform their speculation or to turn it down.
So, that's how Numbeo was born. Numbeo:
- provides prices to readers of the website for free
- allows a person to estimate their own expenses
- uses the wisdom of the crowd to get data that is as reliable as possible
- provides a system for systematic research of cost of living and property markets
- provides a system for other systematic economic research on a huge dataset with worldwide data
Collecting and processing data
To collect data, Numbeo relies on user inputs and on data collected manually from authoritative sources
(websites of supermarkets, taxi company websites, governmental institutions, newspaper articles, other surveys, etc.). Automatic and
semi-automatic filters are used to filter out noisy data. The simplest filter works as follows: if, for a particular price in a city,
the values are 5, 6, 20 and 4 within a reasonable time span, the value 20 is discarded as noise.
Afterwards, the lowest quarter (¼) and the highest quarter of the inputs are discarded as borderline cases. From the remaining entries, the mean value is calculated,
and the lowest and highest of them are displayed.
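The two steps above can be sketched in Python as follows. This is only a minimal illustration, not Numbeo's actual code: the text does not say how the simplest filter decides that 20 is noise, so the rule used here (drop values outside a band around the median, controlled by a hypothetical `factor` parameter) is an assumption, as are the function names and the choice to round a quarter down to a whole number of entries. The quartile trimming and the mean follow the description above.

```python
from statistics import mean, median

def drop_noise(values, factor=2.0):
    """Assumed noise rule: keep values within [median / factor, median * factor]."""
    mid = median(values)
    return [v for v in values if mid / factor <= v <= mid * factor]

def trimmed_summary(values):
    """Discard the lowest and highest quarter, then report mean, low and high."""
    ordered = sorted(values)
    k = len(ordered) // 4                    # entries to drop at each end (rounded down)
    kept = ordered[k:len(ordered) - k]
    return {"mean": mean(kept), "low": kept[0], "high": kept[-1]}

inputs = [5, 6, 20, 4]
clean = drop_noise(inputs)                   # 20 falls outside the band and is dropped
summary = trimmed_summary(clean)
print(clean, summary)
```

With the inputs 5, 6, 20 and 4, the assumed rule keeps 5, 6 and 4. Only three values remain, so a quarter rounds down to zero entries and nothing more is trimmed in this small case.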
There are more sophisticated filters in use, and they perform better when there are more inputs. One of them guards against the filters being badly trained:
it digs into discarded data (spam data) and, if it notices irregularities, moves those entries back into the calculation.
To put it briefly, Numbeo uses heuristic technology. Using the existing data, Numbeo periodically discards data that are most likely
statistically incorrect. Numbeo also archives old data (our default data deprecation policy is 12 months, although we use data up to 18 months old
when we don't have fresh data and indicators suggest that inflation is low). The values of old data are preserved
for historical purposes. Because a country has more inputs than a single city, data shown on a country level in general contain less
noise than data shown on a city level.
The definition of indices used in cost of living section of the website is available here
We use multiple currency feeds, including the European Central Bank feed, to update our internal currency exchange rates almost every hour.
We save the EUR, USD and local currency values in our database on the day they are entered. When calculating averages, we reuse one of those entries, chosen based on currency
stability history and the predominant currency in the country, to try to minimize cross-currency comparison errors.
Our price data should include GST and VAT. Our average salary data should be the value after income taxes. We can therefore use these data directly
to estimate local purchasing power.
Numbeo indices are a best guess
of relative average expenses in a given city. Weights are subject to change over time.
But since the methodology is not hidden, at the time of writing these weights are as follows:
mysql> select name, category, cpi_factor, rent_factor from item order by category, relative_id;
| name | category | cpi_factor | rent_factor |
| Price per Square Meter to Buy Apartment in City Centre | Buy Apartment Price | 0 | 0 |
| Price per Square Meter to Buy Apartment Outside of Centre | Buy Apartment Price | 0 | 0 |
| 1 Pair of Jeans (Levis 501 Or Similar) | Clothing And Shoes | 0.35 | 0 |
| 1 Summer Dress in a Chain Store (Zara, H&M, ...) | Clothing And Shoes | 0.35 | 0 |
| 1 Pair of Nike Shoes | Clothing And Shoes | 0.35 | 0 |
| 1 Pair of Men Leather Shoes | Clothing And Shoes | 0.35 | 0 |
| Milk (regular), (1 liter) | Markets | 25 | 0 |
| Loaf of Fresh White Bread (500g) | Markets | 31 | 0 |
| Rice (white), (1kg) | Markets | 16 | 0 |
| Eggs (12) | Markets | 28 | 0 |
| Local Cheese (1kg) | Markets | 12 | 0 |
| Chicken Breasts (Boneless, Skinless), (1kg) | Markets | 22.5 | 0 |
| Apples (1kg) | Markets | 36 | 0 |
| Oranges (1kg) | Markets | 36 | 0 |
| Tomato (1kg) | Markets | 24 | 0 |
| Potato (1kg) | Markets | 36 | 0 |
| Lettuce (1 head) | Markets | 16.5 | 0 |
| Water (1.5 liter bottle) | Markets | 30 | 0 |
| Bottle of Wine (Mid-Range) | Markets | 4 | 0 |
| Domestic Beer (0.5 liter bottle) | Markets | 6 | 0 |
| Imported Beer (0.33 liter bottle) | Markets | 6 | 0 |
| Pack of Cigarettes (Marlboro) | Markets | 15 | 0 |
| Apartment (1 bedroom) in City Centre | Rent Per Month | 0 | 0.25 |
| Apartment (1 bedroom) Outside of Centre | Rent Per Month | 0 | 0.25 |
| Apartment (3 bedrooms) in City Centre | Rent Per Month | 0 | 0.25 |
| Apartment (3 bedrooms) Outside of Centre | Rent Per Month | 0 | 0.25 |
| Meal, Inexpensive Restaurant | Restaurants | 16 | 0 |
| Meal for 2, Mid-range Restaurant, Three-course | Restaurants | 3.5 | 0 |
| McMeal at McDonalds (or Equivalent) | Restaurants | 6 | 0 |
| Domestic Beer (0.5 liter draught) | Restaurants | 5 | 0 |
| Imported Beer (0.33 liter bottle) | Restaurants | 5 | 0 |
| Cappuccino (regular) | Restaurants | 15 | 0 |
| Coke/Pepsi (0.33 liter bottle) | Restaurants | 6 | 0 |
| Water (0.33 liter bottle) | Restaurants | 6 | 0 |
| Average Monthly Disposable Salary (After Tax) | Salaries And Financing | 0 | 0 |
| Mortgage Interest Rate in Percentages (%), Yearly | Salaries And Financing | 0 | 0 |
| Fitness Club, Monthly Fee for 1 Adult | Sports And Leisure | 2.3 | 0 |
| Tennis Court Rent (1 Hour on Weekend) | Sports And Leisure | 3 | 0 |
| Cinema, International Release, 1 Seat | Sports And Leisure | 6 | 0 |
| One-way Ticket (Local Transport) | Transportation | 20 | 0 |
| Monthly Pass (Regular Price) | Transportation | 1.5 | 0 |
| Taxi Start (Normal Tariff) | Transportation | 5 | 0 |
| Taxi 1km (Normal Tariff) | Transportation | 20 | 0 |
| Taxi 1hour Waiting (Normal Tariff) | Transportation | 0 | 0 |
| Gasoline (1 liter) | Transportation | 60 | 0 |
| Volkswagen Golf 1.4 90 KW Trendline (Or Equivalent New Car) | Transportation | 0.0065 | 0 |
| Basic (Electricity, Heating, Water, Garbage) for 85m2 Apartment | Utilities (Monthly) | 1 | 0 |
| 1 min. of Prepaid Mobile Tariff Local (No Discounts or Plans) | Utilities (Monthly) | 320 | 0 |
| Internet (6 Mbps, Unlimited Data, Cable/ADSL) | Utilities (Monthly) | 1 | 0 |
49 rows in set (0.00 sec)
Local_Purchasing_Power_Index = (Average_Disposable_Salary(This_City) / BasketConsumerPlusRent(This_City)) / (Average_Disposable_Salary(New_York) / BasketConsumerPlusRent(New_York))
BasketConsumerPlusRent(City) = sum_of (Price_in_the_city * (cpi_factor + rent_factor))
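As a rough illustration of the two formulas above, the following Python sketch computes the basket and the purchasing power ratio for a made-up city. Only two items are used, with their weights copied from the table; all prices and salaries are invented for the example, and the function and variable names are hypothetical. The result is the plain ratio given by the formula; any further scaling of the published index is not covered here.

```python
# (cpi_factor, rent_factor) for two items, copied from the weights table above.
WEIGHTS = {
    "Milk (regular), (1 liter)": (25, 0),
    "Apartment (1 bedroom) in City Centre": (0, 0.25),
}

def basket_consumer_plus_rent(prices, weights=WEIGHTS):
    """sum_of(Price_in_the_city * (cpi_factor + rent_factor)) over the priced items."""
    return sum(price * sum(weights[item]) for item, price in prices.items())

def local_purchasing_power(salary, prices, ref_salary, ref_prices):
    """Salary-to-basket ratio of a city relative to the reference city (New York)."""
    city_ratio = salary / basket_consumer_plus_rent(prices)
    ref_ratio = ref_salary / basket_consumer_plus_rent(ref_prices)
    return city_ratio / ref_ratio

# Invented example data, not real Numbeo values.
city_prices = {"Milk (regular), (1 liter)": 1.0,
               "Apartment (1 bedroom) in City Centre": 1000.0}
new_york_prices = {"Milk (regular), (1 liter)": 2.0,
                   "Apartment (1 bedroom) in City Centre": 3000.0}

ratio = local_purchasing_power(1500.0, city_prices, 4000.0, new_york_prices)
print(round(ratio, 4))
```

In this invented case the city's basket is 25 + 250 = 275 and New York's is 50 + 750 = 800, so a ratio above 1 means the average salary in the city buys more baskets than the average salary in New York does.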
To calculate indices for a country, we use all entries (for all cities) to calculate the country average. Note that this is different from the average of the indices of
all cities in that country that we have in the database. Because of the underlying formulas (discarding the top and bottom 25% of the data before calculating the median value),
the low and high prices of an item in one city might sometimes look out of line with the low and high values for its country.
In effect, in country average price and index calculations, each city is weighted by its number of contributors.
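The difference between pooling all entries and averaging per-city results can be seen in a small Python sketch. It is a simplified illustration with invented numbers and hypothetical function names, and it leaves out the discarding of the top and bottom 25% so that only the weighting effect is visible.

```python
from statistics import mean

def country_average(entries_by_city):
    """Pool every entry from every city, then average: cities with more entries weigh more."""
    pooled = [price for prices in entries_by_city.values() for price in prices]
    return mean(pooled)

def average_of_city_averages(entries_by_city):
    """Average each city first, then average the cities: every city weighs the same."""
    return mean(mean(prices) for prices in entries_by_city.values())

# Invented entries: city A has four contributors, city B has one.
entries = {"A": [10, 10, 10, 10], "B": [20]}

print(country_average(entries))            # pooled over all five entries
print(average_of_city_averages(entries))   # treats A and B equally
```

Here the pooled value stays close to city A's price because A contributed four of the five entries, while the average of city averages sits halfway between the two cities.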
Data deprecation policy
The cost of living section uses data entered in the last 12 months (in a special case, when a city has a very low number of entries
and the indicators suggest that inflation in the country is low, we use entries as old as 18 months, because we think that those prices might
not have changed if no one has edited the existing data). Other sections which use the same data set use the same deprecation policy.
Note that some other sections of the website use different data deprecation policies. Each month, old data are moved to archives and can be pulled with our API.
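The age rule for the cost of living section could be expressed roughly as in the Python sketch below. The 12 and 18 month limits come from the text; everything else is an assumption made for illustration: the `min_entries` threshold for "a very low number of entries", the boolean `inflation_is_low` flag standing in for the unspecified inflation indicators, counting age in whole calendar months, and the function names.

```python
from datetime import date

def months_between(earlier, later):
    """Age in whole calendar months (day of month is ignored in this sketch)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def max_age_months(fresh_count, inflation_is_low, min_entries=5):
    """12 months by default; 18 when fresh entries are scarce and inflation is low."""
    if fresh_count < min_entries and inflation_is_low:
        return 18
    return 12

def usable_entries(entries, today, inflation_is_low, min_entries=5):
    """entries is a list of (entry_date, price) tuples for one city."""
    fresh = [e for e in entries if months_between(e[0], today) <= 12]
    limit = max_age_months(len(fresh), inflation_is_low, min_entries)
    return [e for e in entries if months_between(e[0], today) <= limit]

# Invented entries: one is 17 months old, one is 3 months old.
entries = [(date(2023, 1, 15), 10.0), (date(2024, 3, 1), 11.0)]
today = date(2024, 6, 1)
print(usable_entries(entries, today, inflation_is_low=True))
print(usable_entries(entries, today, inflation_is_low=False))
```

With only one fresh entry, the 17-month-old price is still used when inflation is flagged as low, and dropped otherwise.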
Our cartographic policy is to portray the world from a de facto point of view; that is, to portray, to the best of our judgment, the current reality.
Our partners might have a different cartographic policy, which could be reflected in software on our website.
If you need more information about the calculations, please Contact Us