Dimensions spoke with Scott Bauguess, who spent 12 years at the SEC, most recently as the Deputy Director and Deputy Chief Economist in the SEC’s Division of Economic and Risk Analysis (DERA), for insight into the SEC’s current use of XBRL and the future of the SEC’s analytical program, including machine learning for reviewing narrative disclosures.
How does Inline XBRL further the benefits of machine-readable structured data for the SEC, filers, and the market?
One of the biggest benefits is that Inline XBRL makes the data interactive in a way that XBRL was intended to do from the outset but did not initially, because filers reported that information in a separate instance document. Now when you go to an Inline XBRL filing, you see the information displayed right there. It is easier for someone preparing the documents to find mistakes, as they are rendered right into the filing. It reduces silly types of errors, such as sign errors, because you see them rendered in the document. It is very intuitive. What I think may really advance the use of XBRL data is vendors embedding it into their own applications. If you look at an XBRL filing on SEC.gov, in the EDGAR filing system today, it is a really beautiful representation of the information. You can scroll over figures, click on information, see the metadata, and better understand what the values are.
There is no reason why a vendor could not enhance that and start adding its own content in a way the SEC cannot. For example, instead of having the past four quarters of revenue, you could have the past ten years of revenue. The SEC cannot do those types of enhancements because they would potentially be perceived as investment guidance. But a vendor can certainly do that. This was the big hope at the SEC when I was there. The more you get people using the data directly, the better that data becomes.
The SEC has focused on modernizing disclosure. How does XBRL fit into those efforts?
Increased interactivity of the data is, I think, the single biggest benefit that comes from structured disclosure. For example, the disclosure divisions at the SEC (Trading and Markets, Investment Management, Corporation Finance) write the rules that create the disclosures, but among the biggest users of that data at the SEC are the Office of Compliance, Inspections, and Examinations (OCIE) and Enforcement. The more directly usable that data is from the filings themselves in all of the tools and analytics, the more feedback those users can give, and the better the three policy divisions can augment, change, or enhance those disclosures.
You were the Deputy Director of the SEC’s Division of Economic and Risk Analysis (DERA). What is its role in the agency?
DERA is the economics unit of the SEC. It does four things. First, it provides economic analyses to support rulemaking. Second, it provides litigation support to the Division of Enforcement about the economics of the litigation—for example, helping to understand damages and identifying the magnitude of issues and problems. Third, DERA engages in risk assessment: trying to identify problem areas in markets, such as likely areas of misconduct or likely areas of market risk. Fourth—and embedded in each of the other three activities—are data services provided by a large number of DERA staff, such as structured data support and tool development. A lot of these activities overlap with other offices and divisions, but the biggest site of these activities in the SEC is DERA.
One view of my role as Deputy Director was as the chief operating officer of the division. There is a director and chief economist who is the appointed leader of the division. But they usually serve for a year or two, maybe three at the most. The deputy position is the permanent position that maintains continuity of the division’s operations through administration and leadership changes at the SEC.
How has DERA itself evolved with the use of XBRL and other forms of structured data?
Early on, the use of structured data was fairly ad hoc, at least in the context of Form 10-K. A lot of it related to solving problems, such as: What pension-fund liabilities are buried in the footnotes across all companies filing with the SEC?
That became an easier task with XBRL because of the tagging in the footnotes. Over time, the use of XBRL expanded. DERA developed tools that let anyone at the SEC use XBRL data from corporate filings directly in a way that is appropriate for agency staff: disclosure reviewer, operations person, investigator, examiner. These tools let them search for information according to what their role is and how they want it to be presented. The Corporate Issuer Risk Assessment (CIRA) model is one example. When I was leaving the agency, DERA had just rolled out a data-query viewer to the entire Division of Corporation Finance, and it was on track to be released to the entire agency.
What is the CIRA tool? Why should filers and counsel be aware of it?
Imagine a tool that could bucket certain indicators at a company according to what an examiner, investigator, or disclosure reviewer would want to know. CIRA does that. It organizes metrics according to liquidity, leverage, performance metrics, operating margins, turnover, measures of working capital, valuations, accounts receivable, revenues, etc.
There are any number of metrics that are widely used at companies. CIRA aggregates them all into types of disclosures and provides outlier indicators for those measures. For example, you may have ten working-capital measures, and among those ten measures a company may present seven that are 90th-percentile outliers. That immediately gives a signal to a disclosure-review staffer or examiner that something is out of the ordinary there. Examiners might ask why they are outliers. It gives them a place to look in a financial statement.
Generally, outliers in a company tend to be bunched in particular areas. They highlight areas of concern that investors may or may not be aware of or responding to.
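As a rough illustration of the outlier indicators described above, the following is a minimal sketch and not the SEC's actual CIRA implementation. The metric names, the peer data, and the helper functions `percentile_rank` and `flag_outliers` are all hypothetical; only the idea of flagging measures at or beyond the 90th percentile comes from the interview.

```python
from bisect import bisect_left


def percentile_rank(value, peers):
    """Percent of peer values strictly below `value` (0 to 100)."""
    ordered = sorted(peers)
    return 100.0 * bisect_left(ordered, value) / len(ordered)


def flag_outliers(company, peer_table, threshold=90.0):
    """Return the metrics on which the company ranks at or above `threshold`."""
    flags = []
    for metric, value in company.items():
        if percentile_rank(value, peer_table[metric]) >= threshold:
            flags.append(metric)
    return flags


# Hypothetical peer distributions for three working-capital measures.
peer_table = {
    "days_sales_outstanding": list(range(20, 120)),
    "inventory_days": list(range(10, 110)),
    "current_ratio": [x / 10 for x in range(5, 105)],
}

# Hypothetical company under review.
company = {
    "days_sales_outstanding": 115,
    "inventory_days": 105,
    "current_ratio": 2.0,
}

flags = flag_outliers(company, peer_table)
print(flags)
```

This sketch looks only at the upper tail. A real screening tool would presumably also flag unusually low values and would group the flags by disclosure area so that bunched outliers stand out.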
Does CIRA use XBRL?
Originally it did not. It was built to use data from a data aggregator, such as Standard & Poor’s Compustat. The reason was that XBRL did not have a long enough history. Most data aggregators go back decades, which is important for building models and looking for trends. That is less of an issue now that there are eight or nine years of XBRL data.
There still are some normalization problems with XBRL data. Data aggregators do a better job of normalizing and making data comparable across companies than filers do themselves in XBRL reporting. However, CIRA has over the years increasingly incorporated XBRL data in its dashboard. A lot of it is cross-sectional data or artifacts that are not otherwise reported by data aggregators. It becomes supplemental information.
Whether XBRL replaces data aggregators in the future is still up in the air. I think that there are a lot of issues with XBRL data that make it hard for it to replace what you get from data aggregators in the market.
As the SEC increasingly relies on XBRL for reviews and investigations, are there risks for companies that file without paying attention to this trend?
The risks for companies are what they always are in disclosures. If companies are not careful, their XBRL disclosures are more likely to generate flags that may cause the SEC to reach out and ask about them. I do not think any issuer wants to create flags that have no merit, but if they report their XBRL data poorly, that is in fact what happens.
So poor tagging can raise red flags and perhaps harm capital-raising?
I am not saying poor XBRL tagging necessarily raises red flags at the SEC. It creates anomalies in reporting that can be confusing, leading the SEC staff to ask questions. However, from the market point of view, it is detrimental to all filers to have inaccurately reported information. When market analysts are assessing the creditworthiness of a company while trying to make a recommendation, if the data is poor, that is not in a company’s favor.
If you cannot get good answers from financial data, market analysts could ignore those companies. Analysts have finite time and bandwidth; they will not necessarily spend the time to figure out what is wrong with the data, especially for smaller companies with less scope to attract institutional investors’ attention. The easier solution is to stop looking at the company.
What are some of the problems that you have noticed with XBRL tagging? Have you seen improvements?
The biggest problem with XBRL has always been in tag selection. Companies use extensions at a far greater rate than they need to. A lot of that stems from the fact that, early on, they had words describing line items in financial statements that they wanted to match precisely to an XBRL tag, and if it did not precisely match a standard tag, they customized the tag. The SEC allowed the practice for a very long time, and a company is not going to change it unless the SEC tells them to. A lot of that exists today. While extensions have dropped monotonically each year since the XBRL rules were rolled out, the rate is still very high—probably not for good reasons. Extensions really hurt the comparability of XBRL data, particularly for people who want to use the raw data to make inferences across companies.
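One simple way to quantify the extension problem is the share of tagged elements in a filing that come from a company-specific namespace instead of a standard taxonomy. The sketch below assumes tags are available as `prefix:ElementName` strings and treats `us-gaap`, `dei`, and `srt` as the standard prefixes. That list, the `abc` filer prefix, and the two extension element names are illustrative assumptions.

```python
def extension_rate(tags, standard_prefixes=("us-gaap", "dei", "srt")):
    """Share of tags whose namespace prefix is not a standard taxonomy."""
    if not tags:
        return 0.0
    custom = [t for t in tags if t.split(":", 1)[0] not in standard_prefixes]
    return len(custom) / len(tags)


# Hypothetical tag list for one filing.
filing_tags = [
    "us-gaap:Revenues",
    "us-gaap:NetIncomeLoss",
    "us-gaap:Assets",
    "abc:AdjustedSubscriptionRevenue",   # hypothetical company extension
    "abc:RestructuringAndOtherCharges",  # hypothetical company extension
]

rate = extension_rate(filing_tags)
print(f"Extension rate: {rate:.0%}")
```

Tracking this rate by filer and by year is one plausible way to see whether extension use is actually falling.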
How does DERA monitor XBRL quality and help companies improve their XBRL tagging?
Historically, DERA has annually assessed XBRL tagging practices across companies over time. At least up until I left, each year it issued a report to the Division of Corporation Finance on the state of XBRL tagging: who was doing a good job and who was not doing a good job, sorted by types of filers (large, small, medium) and industries. DERA provided this information to the Division of Corporation Finance to let it assess what, if anything, should be done.
A while ago, the SEC staff issued observations on XBRL tagging and some “Dear CFO” letters. Do you think the staff will do that kind of outreach again with Inline XBRL?
I would be surprised. I think there are a lot of views within the SEC that things are just fine. It was never the intent of many there to tell filers how to tag their filings. They have allowed a lot of discretion. I do not think standardization was emphasized as a goal of using XBRL data. One view is that filers were doing this before XBRL and we did not have a problem with it then, so if they want to be non-conventional after we made them use XBRL, why should that matter?
I think that for further improvements in XBRL to happen, there needs to be a change in orientation at the SEC about the goal. That will require someone to say that the SEC should increase standardization of reporting so that information is more comparable. Until that happens, I think that we will continue to see only monotonic improvements in XBRL-tagging quality, and most of that will probably come from newer companies that benefit from the history of older companies and are starting fresh in tagging their financial information. I do not see existing companies changing their practices unless the SEC comes out with specific direction, as it did with the “Dear CFO” letters several years ago.
What do you see as the future of expanding XBRL requirements, such as in the proxy statement?
I think there’s good news when it comes to new disclosures. Each new disclosure rule being written today includes considerations on how to structure that information in XBRL. The SEC has been pretty thoughtful about what that should look like. It is far easier to make those changes to reporting format when you are modifying existing disclosures or asking for new disclosures.
I think it is relatively bad news for disclosures that already exist, such as the proxy statement. I certainly was not aware of any initiative before I left the SEC (and I am not aware of any since) that would go back and wholesale restructure a document like that. It could be complicated to do the proxy statement with the XBRL taxonomy. A lot of development work would be required; it would be similar to a 10-K filing, where you would have to build and maintain a taxonomy to describe the different ways in which you report information. But there would also be a multitude of benefits.
There is a proposed rule out for public comment on updating filing fees. In that rule there are some questions about whether the SEC should further structure a filing, such as the S-1 registration statement, to tag additional information.
That is an opportunity to expand structured data, particularly XBRL data. This is a good barometer for what the SEC is likely to do in the future under the current administration.
You gave an interesting speech called “The Role of Machine Readability in an AI World” at the Financial Information Management Conference in 2018. You have said you prefer the term “machine learning” to “artificial intelligence.” How is the SEC using machine learning to do its work?
It is being used in two ways. One is for narrative disclosures: unsupervised machine-learning methods, such as topic modeling, are being used to identify latent trends in documents. These are things that a human may not otherwise detect, absent a mathematical model that picks out commonalities in words and phrases within and across registrant filings.
The other area is in using supervised machine-learning methods, such as random forest models, to enhance the ability to discover relationships that you would not find if you were developing models from original human design only. These machine-learning algorithms can look through hundreds of models to find which works best. That is adding some power to the existing modeling methods at the SEC.
To go back to topic modeling and some of the narrative disclosures that are being analyzed: The one that had a lot of success when I was there was looking at Form ADV, Part II. This is a plain-English disclosure to investors from investment advisors that describes their business model. When topic modeling was applied to those forms and then the latent trends were mapped onto ten years’ worth of examination data, the SEC staff found strong correlations to types of disclosures and topics that led to enforcement referrals. When that was first run, the model predicted the likelihood of an Enforcement referral at a rate five times greater than a random selection, or what humans had historically picked out. That was very encouraging.
As I left, the model had just finished its second round of calibration with the SEC’s New York regional office and its examination planning staff. My hope is that they are continuing to do that and using it as part of the selection process for examination candidates. When you can visit only 15% of registrants in a year, analytics can really help you identify which 15% you should visit.
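The "five times greater than a random selection" comparison is what is usually called lift: the hit rate among model-selected registrants divided by the hit rate across all registrants. The worked example below uses entirely hypothetical counts, chosen only so that the selected group is 15% of registrants and the lift comes out at five. None of these figures are SEC data.

```python
def lift(hits_selected, n_selected, hits_overall, n_overall):
    """Hit rate in the selected group divided by the overall base rate.

    Written as a single ratio of integer products so that the
    result does not pick up rounding error from two separate divisions.
    """
    return (hits_selected * n_overall) / (n_selected * hits_overall)


# Hypothetical population: 10,000 registrants, 200 of which would lead to
# a referral if examined (a 2% base rate).
n_overall, hits_overall = 10_000, 200

# Hypothetical model selection: 1,500 registrants (15%), 150 of which lead
# to a referral (a 10% hit rate).
n_selected, hits_selected = 1_500, 150

value = lift(hits_selected, n_selected, hits_overall, n_overall)
print(value)
```

Under these assumed numbers, examining only the model's picks would surface referrals at five times the rate of examining registrants at random.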
Do you foresee the progression of machine learning at the SEC to look at disclosure filings, such as 10-Ks, 8-Ks, and proxy statements?
That is a great question. Narrative disclosures such as the MD&A or Description of Business in SEC filings are prime targets for doing this type of analysis. When I was at the SEC, prototyping had been done on the MD&A section of the 10-K. Prototyping had also been done on TCRs (tips, complaints, and referrals) that had come into the SEC. This is a narrative disclosure in which someone says they think fraud is happening.
You can extract signals from any narrative disclosure. The question is how you calibrate those signals to something you care about. With Form ADV, it was past examination results. With TCRs, you can calibrate it to after-the-fact dispositions of those TCRs. That is where the value is. There were all sorts of avenues the SEC was going down when I left. It was being popularized not just in DERA but in many areas of the SEC where they have analytical units.
How do you think the European Single Electronic Format (ESEF), the XBRL mandate of the European Securities and Markets Authority, will benefit regulators such as the SEC?
When I was a regulator, the endorsement of a technology by another jurisdiction reaffirmed what we were doing and why. What I found interesting is that after the SEC mandated the use of XBRL, other jurisdictions not only followed suit, they also moved ahead of the SEC in their use of XBRL, and that in turn informed the SEC on how it might expand its use of XBRL.
Inline XBRL is a consequence of watching Europe and seeing how it was being used in the United Kingdom and how it was likely to be adopted elsewhere. That really helped the adoption of Inline XBRL at the SEC. Inline XBRL was a four- or five-year project that just bumped along until it gained momentum. That is where I think the European Single Electronic Format (ESEF) XBRL mandate has helped and will continue to help the SEC.
In terms of how the new ESEF XBRL mandate will impact European investors, I think it is a very positive development. As I understand the requirements, it is similar to the US adoption of Inline XBRL, but with at least two notable differences. The first is that filers will need only to block-tag their footnotes. This will make reporting easier and less burdensome, but it will also make it more challenging for investors to find and aggregate important nuggets of information often buried in the footnotes.
The second difference is that extensions will need to be anchored to the core taxonomy element that has the closest accounting meaning. It is hard to overstate what a good decision that was. Anchoring provides valuable metadata about uniquely reported items, making reporting elements easier to machine-interpret and aggregate. And requiring anchoring is likely to limit the unnecessary use of extensions, i.e., those due to lazy tagging effort or motivated by obfuscation. As such, it will provide a nice benchmark for US regulators to assess reporting efficacy.
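To show why anchoring makes extensions easier to machine-interpret, here is a minimal sketch that uses an anchor mapping to group company-specific elements under their closest core concept. The company names, extension element names, values, and the `group_by_core_concept` helper are hypothetical. The sketch also ignores the distinction real anchoring makes between wider and narrower relationships, so it groups facts for comparison and does not sum them.

```python
from collections import defaultdict


def group_by_core_concept(facts, anchors):
    """Group (company, element, value) facts under their core taxonomy concept.

    Extension elements are looked up in `anchors`; core elements map to
    themselves.
    """
    grouped = defaultdict(list)
    for company, element, value in facts:
        core = anchors.get(element, element)
        grouped[core].append((company, element, value))
    return dict(grouped)


# Hypothetical anchor declaration: extension element -> core element.
anchors = {
    "abc:SubscriptionAndLicensingRevenue": "ifrs-full:Revenue",
}

# Hypothetical reported facts from two filers.
facts = [
    ("Company A", "ifrs-full:Revenue", 1200.0),
    ("Company B", "abc:SubscriptionAndLicensingRevenue", 950.0),
    ("Company B", "ifrs-full:Assets", 4000.0),
]

grouped = group_by_core_concept(facts, anchors)
for concept, rows in grouped.items():
    print(concept, rows)
```

Without the anchor, Company B's extension would look like an unrelated line item. With it, a program can place the figure next to Company A's revenue and leave the judgment about comparability to the analyst.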
How are you applying your SEC experience in your new academic role?
When I knew I was leaving the SEC, I did not know what I wanted to do. I wanted to do something similar, just not at the SEC. What I have done is to recreate my former job within academia. The plan is to continue to comment and work on SEC rules but do that by engaging academics. I am building infrastructure to get academics more involved in SEC policy-making. This will make it easier for them to comment on proposed rules and take part in research that may help the SEC.
By Scott Bauguess, McCombs Business School, University of Texas at Austin
Scott Bauguess spent 12 years at the SEC, most recently as the Deputy Director and Deputy Chief Economist in the SEC’s Division of Economic and Risk Analysis (DERA). He is now on the faculty of the McCombs Business School, University of Texas at Austin. He is the director of the Securities Markets Regulation program in its Center for Enterprise and Policy Analytics.
This interview expresses the views of Dr. Bauguess and does not necessarily reflect the views of any current or past employers.
The full article appears in Dimensions Vol. 2020, No. 2.