With GDPR and all the fun that follows it, I’ve been needing some tools to work out where this pesky data is. Yes there are some nice new features to SSMS that help and yes RedGate is getting in on the action to with some of there new offerings, but to get started I wanted to take a look myself at what we had.. and how bad was it with regards to PII (Personally identifiable information).

After playing around with some scripts for a while, I ended up writing some functions. As they all link together nicely, it made sense to put them into module.

Say hello to DBAPii! At present there are only 3 functions.

DBAPii PowerShell Module

Find-PiiColumn

This is used to help find columns based on there name. I started using this to try and find tables and columns with PII by doing commands like:

$pii = "email", "emailAddress", "firstname", "surname", "fullname"
Find-PiiColumn -SqlInstance DEV01 -Contains $pii

You can also do wildcard searches like

$pii = "email", "name"
Find-PiiColumn -SqlInstance DEV01 -WildCardSearch $pii

which would find any columns with either email or name in them..

NOTE: The awesome folks at DBATools did some testing for me on this one and found if you have 100s of databases on an instance.. this runs like a dog! I need to re-write some of the logic to be done in TSQL to speed it up!

Find-PiiEmailAddress

This is a bit more meaty and not pretty! but I couldnt think of another way to do it! Basically.. this does a SELECT TOP() FROM every table in the database then uses regex to check every result in every column to see if an email is in there! (As i said not pretty!) but for times were you have to know you can certainly be a lot more confident in this, than you can on doing just a search with Find-PiiColumn. You get to specify the value for the TOP as well so if you want to search only the top 2 you can, if you want 2000 go for it!

Find-PiiEmailAddress -SqlInstance localhost -Database myDB

NOTE: Ive not had a chance to test this much! it works for normal tables, but I haven’t had a chance to try any of the newer styles (temporal,cdc etc)

Update-PiiEmailAddress

So say your moving your database into DEV or UAT and want to remove all those production real peoples email address! you can now pipe Find-PiiEmailAddress into Update-PiiEmailAddress, this will create a random GUID local-part and you can set the domain to what you like in the parameters.

Find-PiiEmailAddress -SqlInstance localhost -Database myDB | Update-PiiEmailAddress -SqlIsntance localhost -Database myDB

As more things pop up im hoping to keep build out the module as it seems a really interesting and relevant set of functions for people to use at the moment. Any comments or fork requests are welcome to keep it moving!

 

Advertisements