Vital statistics

Saturday, 01 October 2022

Sholto Macpherson
Journalist

    Inflation is difficult to prepare for. It causes sudden jumps in the cost of supplier inputs, puts pressure on margins and forces companies to consider raising prices. However, boards searching for inflationary signals have an ally in the Australian Bureau of Statistics.


    Dr David Gruen, head of the Australian Bureau of Statistics (ABS), spends his hours hunting down enormous datasets he can turn into useful intelligence for the government, business community and wider society.

    Early in the pandemic, the ABS wanted to track changes in spending patterns more closely. It saw that it could play a role in shaping the government’s response to lockdowns — something that would have been impossible two decades ago. “In the old days, ABS officials would walk around supermarkets and write down prices,” says Gruen.

    For many years, the agency has collected scanner data from supermarkets, representing 20 per cent of the basket of goods and services that feeds into the consumer price index (CPI). The ABS thought it could do better.

    As the first lockdowns took effect, Gruen asked the CEOs of the four major banks to share their transaction data. “All four wrote back and said they’d be happy to help,” he says. By adding bank feeds to supermarket transactions, the ABS doubled its coverage of household consumption, which greatly increased the accuracy of the CPI.

    Directors and data

    Gruen spends his days evaluating the usefulness of various datasets and their potential public value. So what advice can he give to company directors on how to exploit the potential of their own data mines or those of the ABS?

    Gruen says it’s important to keep up to date with the latest ABS products. The CPI is a good example. From 26 October, the ABS will publish CPI data on a monthly basis, using two-thirds of tracked prices. “Although not as accurate as the quarterly one, we’ve demonstrated it gives a pretty strong signal about which way inflation is heading,” says Gruen.

    Most advanced countries publish a monthly CPI in line with the International Monetary Fund (IMF) Special Data Dissemination Standard (SDDS). The SDDS sets guidelines for the compilation of statistics for those countries that wish to access international capital markets. Australia is one of only two OECD economies (the other is New Zealand) and the only G20 country that does not produce a monthly CPI.

    Until recently, the cost of producing an Australian monthly CPI has been prohibitive. Enhancements to the quarterly CPI through the use of new data sources have reduced data collection costs, making it possible to produce a more frequent measure of household inflation. In particular, the use of scanner data and web-scraping (automated) data collection techniques provides high-frequency data at a lower cost.

    “Developments in inflation are of considerable interest to the public and to policymakers, particularly in the Reserve Bank, so there is substantial benefit in them getting a more timely read,” says Gruen.

    The agency’s work on the impact of COVID-19 has also produced commercially useful information.

    Most ABS reports are available for download from the agency’s website. However, there is an option for companies that want to view more sensitive ABS datasets. The ABS DataLab contains two integrated datasets. The Business Longitudinal Analysis Data Environment (BLADE) is an economic data tool combining tax, trade and intellectual property data with information from ABS surveys. It is designed to provide a better understanding of the Australian economy and businesses’ performance over time.

    The Multi-Agency Data Integration Project (MADIP) contains information on individuals drawn from the ABS, Australian Taxation Office (ATO), Department of Education, Skills and Employment, Department of Health, Department of Social Services, and Services Australia.

    These datasets are available to trained data analysts and mainly used to guide policy. “Lots of people — mainly in the public and university sectors — can get access to that data and do analysis on it,” says Gruen. “That is another big area of growth for us.”

    One of Gruen’s roles is to educate executives in the public sector. He recommends directors undertake similar courses. “So much more of the economy is run digitally. We are putting together courses for senior managers — obviously not with the technical detail, but so they understand what is being talked about. They need a better sense of what the issues are.”

    A bigger broker

    As well as hoovering up industry data sets, the ABS collects revenue, employment and income data from the ATO, spending trends from household and business surveys, death certificates from the Registrar of Births, Deaths and Marriages, and detailed demographic data from its own census.

    “The digital revolution has increasingly created big data sets,” says Gruen. “All sorts of activity is now intermediated through digital platforms and that generates data. The ABS has been increasingly seeking access to that data because we can generate enormous public value with it.”

    Most levels of government are intent on digitising the machinery of bureaucracy, albeit with varying degrees of success. The federal government has led by example with its digital-by-default agenda and the MyGov portal, launched in 2013. From taxation to payroll collection to drivers’ licences, the faster we convert paper records, the cheaper governing becomes.

    Single touch payroll (STP) is one example. It eliminated the need for companies to reconcile payroll payments at the end of the year and issue PAYG summaries. It also made it much harder for businesses to avoid their superannuation obligations, tightened up tax payments and helped to reduce the size of the “black economy”.

    “Almost all employers are connected up to the ATO via single touch payroll,” says Gruen. “When they run their payroll, how much they pay to each of their employees is recorded by the ATO. Since April 2020, we’ve had access to that data.”

    The ABS turns 10–11 million weekly STP transactions into monthly reports on movements in the labour market. Because the reports are based on the employment status of about 12 million of Australia’s 13.6 million employees, the ABS can produce a very accurate estimate.
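    As a rough illustration of why that estimate can be so accurate, the coverage figures quoted above can be turned into a share. This is a minimal sketch using only the two rounded numbers in the text; the function name is hypothetical and nothing here reflects how the ABS actually computes its labour market estimates.

```python
def coverage_share(covered: float, total: float) -> float:
    """Return the fraction of a population that a data source covers."""
    if total <= 0:
        raise ValueError("total must be positive")
    return covered / total


# Rounded figures quoted in the article: about 12 million employees
# reported through STP, out of about 13.6 million employees overall.
stp_share = coverage_share(12_000_000, 13_600_000)
print(f"STP covers roughly {stp_share:.0%} of employees")
```

    On those figures, STP captures close to nine in ten employees, so the data behaves far more like a near-census than a sample survey.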

    In July 2016, just before the previous census, the ABS held about 1.1 petabytes of data (one petabyte is equal to 1000 terabytes). In five years its holdings have almost quintupled to five petabytes (including backups and cloud-based holdings) and are growing at 10–20 per cent a year, depending on projects.
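    The growth figures above can be put on a common footing with a compound-growth calculation. The sketch below is illustrative only: it assumes smooth year-on-year compounding, uses the rounded holdings quoted in the text, and the function names and the five-year projection horizon are assumptions rather than ABS forecasts.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that takes `start` to `end` in `years`."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1 / years) - 1


def project(holdings: float, annual_rate: float, years: int) -> float:
    """Holdings after `years` of compounding at `annual_rate`."""
    return holdings * (1 + annual_rate) ** years


# Figures from the article: about 1.1 PB growing to about 5 PB in five years.
past_rate = implied_annual_growth(1.1, 5.0, 5)
print(f"Implied past growth: {past_rate:.0%} a year")

# Hypothetical five-year outlook at the quoted 10-20 per cent range.
low = project(5.0, 0.10, 5)
high = project(5.0, 0.20, 5)
print(f"Five years on: between {low:.1f} and {high:.1f} PB")
```

    Read this way, the quoted 10–20 per cent range is noticeably slower than the pace implied by the past five years, which included the move beyond traditional collections.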

    “The ABS has seen rapid expansion of its data holdings in the past few years as we have moved beyond our traditional collections to other sources of data, both government and commercial,” an ABS spokesperson said.

    “There has also been additional pressure from the security front, as the threat of ransomware demands a more robust and distributed backup regime.”

    Fortunately, this demand for data hasn’t come at a major environmental cost. ABS data centres in Canberra use 100 per cent renewable power and follow ISO-based protocols to avoid water wastage.

    Stats behind the stats

    15,187,394
    website sessions

    11,429,244
    calls to API service

    15,520
    DataLab sessions

    531
    statistical releases for FY22

    Source: ABS Annual Report 2020–21

    Data security

    Not only has the volume of data increased, so has the ease of access. Petabytes of commercial data sit in massive databases connected directly to the internet. High-speed data links can whisk that data away to ABS data centres. Ask the right people and you can access the secrets within these troves. Part of Gruen’s job is to convince people it is worthwhile sharing data with his agency. But businesses, especially banks, can be protective of this information. Why were the Big Four so willing to share?

    “It’s part of their social responsibility to be willing to share that data provided we’re not direct competitors with them, which we’re not,” says Gruen. “The pitch I have used is that we are in a position to create substantial public value out of this data.”

    The data behind the locked doors of ABS data centres contain a fortune of personally identifiable and highly sensitive information — wages, medical records, personal income, net wealth. The ABS made headlines during the 2016 census when a massive cyberattack made its website inaccessible. The prospect of census secrets accessed by a foreign state for purposes unknown unnerved bureaucrats and citizens alike. In the aftermath, the ABS worked with the Australian Cyber Security Centre, a part of the Australian Signals Directorate, to bolster its defences. “We are trying to make sure our defence is state of the art,” says Gruen. “But there are no guarantees in cybersecurity — it’s a best endeavours basis.”

    The ABS was also targeted by the same attack that took the 2016 census down, a distributed denial of service that was the largest ever conducted in Australia. “We have done a lot [of work to ensure security], but never say never,” says Gruen.

    The 2021 census was executed without interruption, but not without incident.

    “There were close to a billion attacks on census day — most of them by bots — and they were all repelled successfully,” says Gruen.

    The ABS has no qualms about the security of cloud computing providers, using data centres run by Amazon Web Services and Microsoft Azure to manage the load. The digitisation of payroll forced its hand. “Bringing the STP data in the door taught us new ways of doing things, which involve [using] the cloud,” says Gruen.

    With its steadily increasing cache of government and enterprise records, the ABS will reveal deeper knowledge of who we are and what we do — and become an increasingly valuable target.

    Benchmarking roadmap

    The ABS is expanding its ability to collect data and report back to businesses directly. It is building a web application to harvest financial data from online accounting software for surveys.

    “Imagine a world where we say to small businesses, if you allow us to access your accounting software, we will extract those elements that we need for our business surveys,” says Australian Statistician and Australian Bureau of Statistics head Dr David Gruen. “If we’ve got 100,000 firms, we could say to them, your turnover puts you at the 68th percentile of rural NSW firms in your sector.”

    He says the concept is a work in progress and the agency has made no decisions on how much detail the benchmark report will contain. “It will partly depend on how many sign up to this.”
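    The kind of benchmark Gruen describes comes down to a percentile rank within a peer group. The sketch below shows one common definition (the share of peers with turnover at or below the firm’s own). The turnover figures and function name are made up for illustration, and the ABS has not said which definition or peer groupings it would use.

```python
from bisect import bisect_right


def percentile_rank(value: float, peers: list[float]) -> float:
    """Percentage of peer values that are less than or equal to `value`."""
    if not peers:
        raise ValueError("peers must not be empty")
    ordered = sorted(peers)
    # bisect_right counts how many sorted values are <= value.
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical annual turnover (in dollars) for firms in one sector and region.
peer_turnover = [120_000, 250_000, 310_000, 480_000, 900_000]
rank = percentile_rank(310_000, peer_turnover)
print(f"Turnover is at the {rank:.0f}th percentile of peers")
```

    With many thousands of participating firms, the same calculation could be repeated for each sector and region, which is why the level of detail depends on how many businesses sign up.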

    But is a benchmark report enough incentive to give the ABS this access? “Well, there are two carrots,” he says. “Our legislation means we can compel people to fill in our surveys. So if you get chosen as a small business, that can take up quite a lot of time. Not only will you get benchmarking, but if you tick the box, you can be done with the whole process in five minutes rather than your CFO having to get all this information and key it into a form.”

    Gruen is aware that the agency needs to tread carefully. “We have to be mindful of community expectations,” he says. “We want to do things that make people’s lives easier, collect useful information in a way that’s transparent, not in a way people find alarming. We’re doing this fairly deliberately and picking off the things we think have the most value. It’s partly a function of what new datasets become available.”

    The ABS is making people aware it is on the lookout for these datasets. ABS staff will also volunteer ideas about particular datasets and how to get new types of information.

    Gruen is excited about the possibilities of insurance data. “That might be valuable for informing hazard analysis on natural disaster areas,” he says. “It really is a case of seeing what is available and partly what are the issues of the day.”

    Improving timeliness is another area of interest. The ABS used to report mortality with a 20-month lag to allow time for coroners’ reports. “When you’re in the middle of a pandemic, that’s not very helpful,” notes Gruen. The ABS began publishing provisional mortality data with a two-month lag based on doctor-certified deaths. “That turns out to cover about 85 per cent of deaths, so you can get very high-quality information.”
