Finding PPP Loan Fraud

Quickly Discover Hidden Connections in PPP Loan Data Using Senzing

Jeff Jonas
Sep 23, 2020

At Senzing we have created the first real-time AI for entity resolution. We make it quick and easy to accurately combine data about people and companies from different data sources. No other technology that exists today can do this in real time, at scale, with this level of accuracy without any training, tuning or experts. It is also the most affordable option available!

You can try it right now on Paycheck Protection Program (PPP) data in under 20 minutes.

To help make this fast and simple, we have prepared three Senzing-ready .csv files filtered to contain Las Vegas related records.
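
For readers curious what "filtered to contain Las Vegas related records" could look like in practice, here is a minimal Python sketch. It is not how the Senzing-ready files were actually prepared: the column names (BusinessName, Address, City, State) and the sample rows are hypothetical, and mapping columns to Senzing's attribute names is not shown.

```python
import csv
import io

def filter_rows_by_city(rows, city, city_field="City"):
    """Yield only the rows whose city field matches `city` (case-insensitive)."""
    wanted = city.strip().upper()
    for row in rows:
        if (row.get(city_field) or "").strip().upper() == wanted:
            yield row

# Tiny in-memory stand-in for a loan file; column names and rows are made up.
sample_csv = io.StringIO(
    "BusinessName,Address,City,State\n"
    "EXAMPLE CAFE LLC,1 Main St,Las Vegas,NV\n"
    "SAMPLE TOOLS INC,2 Oak Ave,Reno,NV\n"
    "DEMO DENTAL PC,3 Pine Rd,LAS VEGAS,NV\n"
)

las_vegas_rows = list(filter_rows_by_city(csv.DictReader(sample_csv), "Las Vegas"))
for row in las_vegas_rows:
    print(row["BusinessName"], "|", row["Address"])
```

With a real download you would pass an open file to csv.DictReader instead of the in-memory sample, and adjust city_field to whatever the source file calls that column.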

[NOTE: Instructions for running the whole PPP loan file are located at the bottom of this blog post.]

If you need any help along the way, reach out to support@senzing.com for some free assistance.

IMPORTANT DISCLAIMERS

  • The Senzing-ready .csv links provided are snapshots from the past, so the information is out of date. If you are doing real research, be sure to download the latest files (see links at bottom).
  • Many organizations have multiple legal entities, sometimes similarly named. Without more data, Senzing may match these entities if they are located at the same business address. Such apparent duplicates are likely legitimate, separate legal entities. Note: as more data is loaded, these overmatches begin to automatically self-correct, which is a unique capability of Senzing.

Loading and Exploring Las Vegas PPP Data

1. Download and install the free Senzing App here. [No personal data flows to Senzing, Inc.]

2. Launch the Senzing App.

3. Create a Project.

  • Select “Projects” (left toolbar, icon with the hammer).
  • Select “Add Project.”
  • Name the project whatever you like, e.g., “PPP Las Vegas.”
  • Select “Create.”

4. Load the PPP file into Senzing.

  • Download the “PPP_Loans_Over_$150k_LasVegas.csv” that we have prepared here.
  • Select “Data” (left toolbar, icon with the cylinder).
  • Drag and drop the “PPP_Loans_Over_$150k_LasVegas.csv” file onto the canvas.
  • Click “Load” on the card.

5. Review the results.

  • Once loading is complete, click “Review.” (After loading, “Load” changes to “Review.”)
  • Explore the Duplicates — records Senzing thinks belong to the same organization.
  • On the far right click the little “expand” icon (looks like a small blue clock) that appears as you hover over any of the “Other Data” column entries.
  • Once finished exploring “Duplicates,” click on “Possibly Related.” The Match Key column explains why they are related.
  • Click on any “Entity ID” (left column in chart) to see the entity’s resume.

Highlights:

  • Notice in the top blue bubble there are 40 duplicates.
  • Looking over these duplicates, you will notice some are probably false positives. For example, “NG WASHINGTON”, “NG WASHINGTON II” and “NG WASHINGTON III” are probably different legal entities, each eligible for a PPP loan. Records like these match because of name and address similarity.
  • You may notice other duplicates that look like identical legal entities — these are examples where further human analysis is required.
  • Select “Search” (left toolbar, icon with a magnifying glass) and search for this address: “3130 S Durango Dr STE 400 Las Vegas.” Then click any of the possibly related entity names to see how the entities are connected.
  • Click the “Show Match Key” in the lower right corner and you will see how these three entities “BOYACK AND ASSOCIATES INC”, “BAI LAS VEGAS LLC” and “BAI NEVADA, LLC” are related.
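
To get a feel for why the “NG WASHINGTON” records look so alike to a matcher, here is a small Python sketch using the standard library's difflib. This is only a generic string-similarity illustration, not Senzing's matching logic, and the helper name name_similarity is made up for this example.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()

names = ["NG WASHINGTON", "NG WASHINGTON II", "NG WASHINGTON III"]

# Compare every pair of names once and print the similarity score.
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        score = name_similarity(names[i], names[j])
        print(f"{names[i]!r} vs {names[j]!r}: {score:.2f}")
```

All three pairs score close to 1.0, so when the addresses are also the same, nothing in the data separates them. That is why a human (or more data) is needed to decide whether they are truly distinct legal entities.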

Add Reference Data to Improve Accuracy

Reference data are carefully curated data sets that can be used to improve entity resolution accuracy. For this demonstration, we will use a publicly available file called the National Provider Index (NPI), which contains a list of US health care providers curated by the US Department of Health and Human Services.

1. Load the NPI file into Senzing.

  • Download the “NPI_Orgs_LasVegas.csv” that we have prepared here.
  • Select “Add Data Source.”
  • Drag and drop the “NPI_Orgs_LasVegas.csv” file onto the canvas.
  • Click “Load” on the card.

2. Review the results.

  • Once loading is complete, click “Review” on the PPP-LOANS … card.
  • Notice there are now 41 duplicates in the PPP data — recall, before loading the NPI file there were only 40. Which match is new? Hint: Use the More button to reveal records from other data sources that may have contributed to the matching decision.
  • Notice there are now two possible duplicates — recall, before loading the NPI file there were zero.
  • Click on the two (2) “Possible Duplicates.” Can you figure out what Senzing learned that caused it to change its mind about these matches?

Highlights:

  • Using the NPI reference data, these three PPP records came together: “BAI LAS VEGAS LLC”, “BOYACK AND ASSOCIATES INC”, and “BAI NEVADA, LLC”. Why? When the NPI record revealed BAI was a DBA (doing business as) “BOYACK AND ASSOCIATES”, Senzing’s entity-centric technology caused it to reevaluate its earlier decision and improve it, in real time.
  • In a similar manner, the NPI reference data surfaced two possible matches. These have close names at the same address.
  • Other popular reference data that can significantly improve matching results are commercially available from data providers like Dun & Bradstreet, Moody’s and OpenCorporates.
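
The way a single reference record can pull earlier records together can be sketched with a toy Python example. To be clear, this is not Senzing's algorithm: it uses exact matching on a (name, address) pair and a simple union-find, and every record, name, and address below is invented for illustration.

```python
def find(parent, x):
    """Follow parent links to the group root, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def resolve(records):
    """Group records that share at least one exact (name, address) pair."""
    parent = {rid: rid for rid in records}
    seen = {}  # (name, address) -> first record id that carried it
    for rid, rec in records.items():
        for name in rec["names"]:
            key = (name, rec["address"])
            if key in seen:
                # Merge this record's group with the group that already has the key.
                parent[find(parent, rid)] = find(parent, seen[key])
            else:
                seen[key] = rid
    groups = {}
    for rid in records:
        groups.setdefault(find(parent, rid), set()).add(rid)
    return sorted(sorted(group) for group in groups.values())

# Two hypothetical loan records at the same address with unrelated-looking names.
before = {
    "ppp-1": {"names": ["NORTHSIDE CLINIC LLC"], "address": "100 EXAMPLE ST"},
    "ppp-2": {"names": ["SMITH AND PARTNERS INC"], "address": "100 EXAMPLE ST"},
}
# A hypothetical reference record that lists both names (legal name plus DBA).
after = dict(before)
after["npi-1"] = {
    "names": ["SMITH AND PARTNERS INC", "NORTHSIDE CLINIC LLC"],
    "address": "100 EXAMPLE ST",
}

print(resolve(before))
print(resolve(after))
```

Before the reference record arrives, the two loan records stay apart. Once it is added, it shares a name with each of them, so all three land in one group. The real product weighs many more features and handles fuzzy matches, but the bridging idea is the same.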

How to Combine Other Data to Improve Context

Combining additional data from other public and private sources is easy too. For example, publicly available data from the US Department of Labor Wage and Hour Compliance Actions can be easily added to discover which PPP recipients also have labor violations.

1. Load the DOL Compliance Actions file into Senzing.

  • Download the “Dept_Labor_Whisard_LasVegas.csv” that we have prepared here.
  • Select “Data.”
  • Drag and drop the “Dept_Labor_Whisard_LasVegas.csv” file onto the canvas.
  • Click “Load” on the card.

2. Review the results.

  • Once loading is complete, click “Review” on the PPP LOANS … card.
  • In the upper right area of the screen you’ll see “PPP Loans” in a drop-down. To the right of this you will see the word “NONE”. Click it and change “NONE” to the “DOL — WHISARD” data source.
  • Now click in the middle of the blue circles to see the matches between these data sources.
  • Notice the CASE_VIOLTN_CNT values (Case Violations) on the far right.
  • Scrolling down, use the blue More button on the left side to reveal records from other data sources that may have contributed to the matching decision.
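
If you think of the cross-source view above as data, it amounts to finding resolved entities that have records from both sources. Here is a hedged Python sketch of that idea. The record layout, the entity IDs, the short source codes "PPP" and "DOL", and the sample rows are all hypothetical; only the CASE_VIOLTN_CNT field name comes from the file used in this post.

```python
from collections import defaultdict

def entities_in_both(resolved_records, source_a, source_b):
    """Return {entity_id: [records]} for entities with records from both sources."""
    by_entity = defaultdict(list)
    for rec in resolved_records:
        by_entity[rec["entity_id"]].append(rec)
    return {
        eid: recs
        for eid, recs in by_entity.items()
        if {source_a, source_b} <= {r["data_source"] for r in recs}
    }

# Hypothetical output of entity resolution: each record tagged with its entity.
resolved = [
    {"entity_id": 1, "data_source": "PPP", "name": "EXAMPLE CAFE LLC"},
    {"entity_id": 1, "data_source": "DOL", "name": "EXAMPLE CAFE", "CASE_VIOLTN_CNT": 4},
    {"entity_id": 2, "data_source": "PPP", "name": "SAMPLE TOOLS INC"},
    {"entity_id": 3, "data_source": "DOL", "name": "DEMO DENTAL PC", "CASE_VIOLTN_CNT": 1},
]

overlap = entities_in_both(resolved, "PPP", "DOL")
for eid, recs in overlap.items():
    violations = sum(r.get("CASE_VIOLTN_CNT", 0) for r in recs)
    print(f"entity {eid}: {violations} case violations")
```

Only entity 1 appears in both sources here, so it is the only loan recipient in this made-up sample that also has a compliance record.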

Highlights:

  • Before loading the US Dept of Labor file there were only two possible duplicates. Now there are three. To see this, change the “US DOL — WHD” data source back to “NONE”, then click on the “3” possible duplicates. One of these is new. Takeaway: although this file is not considered reference data, new data from any source can help improve past, present and future matches.
  • While on the same PPP Possible Duplicates screen, check out the Match Key column. Notice all of the rows have “-NPI_Number”, which means the NPI numbers on the records disagreed. Had they not disagreed, Senzing would have considered these records duplicates.
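
If you want to pick match keys apart programmatically, here is a small Python sketch. It assumes a match key is a run of feature names, each prefixed by "+" for agreement or "-" for disagreement. The post only confirms the "-" meaning, so the "+" prefix, the example string, and the function name parse_match_key are assumptions made for illustration.

```python
import re

def parse_match_key(match_key):
    """Split a match key into features that agreed (+) and disagreed (-)."""
    agreed, disagreed = [], []
    for sign, feature in re.findall(r"([+-])([^+-]+)", match_key):
        (agreed if sign == "+" else disagreed).append(feature)
    return agreed, disagreed

# Hypothetical match key: name and address agree, NPI number disagrees.
agreed, disagreed = parse_match_key("+NAME+ADDRESS-NPI_NUMBER")
print("agreed:", agreed)
print("disagreed:", disagreed)
```

Reading the key this way makes the point in the last bullet concrete: the records line up on some features, and the single disagreeing feature is what holds them back from being treated as duplicates.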

SHAZAM!

Look what you have been able to do so quickly! Unlike other technologies that take a long time to set up and configure, Senzing is so easy. Feel free to entity resolve your own data, e.g., your contacts, Salesforce accounts, vendor file, marketing list, etc. If you want additional info on getting started, check out this article.

Hope you enjoyed Senzing. We would love to hear any feedback, especially suggestions on how to make it better. You can reach us here.

Thank you.

BONUS SECTION

1. Just for fun, check out these additional Senzing-ready files, filtered for Las Vegas:

2. Instructions for running all the PPP loan data:

The Senzing API, our main product, is for developers. Our technology makes the complicated task of entity resolution trivial for programmers. Senzing is real-time and scalable to billions of records. More on our unique technology here.

If you are not a developer, the simple Senzing App is for you. While 100k records are free, an affordable license upgrade is available here.

To speed up your full-file PPP project, here are some key links. Use the source links if you need current information for real work. Otherwise, if you are just experimenting, try our Senzing-ready links, which are out-of-date snapshots:

PPP Loans over $150k | Source Link | Senzing-ready Link

National Provider Index | Source Link | Senzing-ready Link (filtered for organizations)

Dept of Labor Compliance Actions | Source Link | Senzing-ready Link

Medicare Supplier Directory | Source Link | Senzing-ready Link

Physician Compare | Source Link | Senzing-ready Link

OIG Exclusions | Source Link | Senzing-ready Link (filtered for organizations)

REFERENCE LINKS

Senzing’s Developer Page

Uniquely Senzing White Paper

Entity Resolution Processes White Paper

Slow Motion Entity Resolution Video

Entity-Centric Learning

Architecture Pattern for Perpetual Insights

Our Partners & Customers

Jeff Jonas

Jeff Jonas is founder and CEO of Senzing. Prior to Senzing, Jonas served as IBM Fellow and Chief Scientist of Context Computing.