DATAMASKING
Almost in every company i have worked for, datamasking is a must-have procedure for test data generation, especially after restoring production data to a prelive or day-1 database. In common sense just truncating the sensitive data or updating the columns to NULL maybe enough for security purposes. But in some individual cases application logic needs to continue which depends on the data availability. In such cases there are two main roadmaps, hashing the real data and un-identify it or some random data may be generated for representing the real sensitive data. The generated data may also ensure the formatting of the known structures such as credit card information, postal code or telephone number of the customers.
PASSWORD GENERATION
After applying the new password policy to our database, forgotten passwords by the end users are not recovered that easily because of the password complexity functions (by applying password verify function) and the password_reuse_max parameter of the profile. This means that, the portal used by the level-1 support is not enough as the procedure lies behind the button is simply changes the password to a default value which is not changed from the beginning of the procedure at all, which causes ORA-28007: the password cannot be reused exception.
TEST DATA GENERATION
In order to test any functionality or any new features by myself, generating data for newly created tables is a necessity. Formatted varchar2 and number columns are very handy to test some functionalities and how some functionalities behave in large amount of structured table data.
Dictionary views can be used for this purpose but If it is not enough for the individual test conditions then generation a random data within the desired format is crucial.
DBMS_RANDOM PACKAGE
For all the mentioned purposes, oracle database has a built-in package which is DBMS_RANDOM to generate random data. There are several functions in this package which can be used to obtain same results with some manipulation of the output values. Main functions to produce random number would be;
DBMS_RANDOM.VALUE() whichs output value is NUMBER datatype, this means you can produce up to 32 digit numbers.
select dbms_random.value() as result from dual; RESULT ------ 0.33363563290178533954768590716355821427
DBMS_RANDOM.VALUE(arg1 number, arg2 number) returns a random value between the supplied limits. The following sql should return a random number between 1 and 20.
select dbms_random.value(1,20) as result from dual; RESULT ------ 14.25291834803147861791906299688905822678
DBMS_RANDOM.STRING() function can also be used in order to generate a random character or a string. Function has two parameters one for the type of characters to be used and the second parameter for the length of the string. The following function definition is copied here from and Oracle 11gR2 databases DBMS_RANDOM package spec. It explains itself very well and nothing needs to be added.
FUNCTION string (opt char, len NUMBER) /* "opt" specifies that the returned string may contain: 'u','U': upper case alpha characters only 'l','L': lower case alpha characters only 'a','A': alpha characters only (mixed case) 'x','X': any alpha-numeric characters 'p','P': any printable characters */ RETURN VARCHAR2; --string ofcharacters
So, if you need to produce some random string with 8 characters long and all characters are alpha numeric and lower case then it is easy by using the DBMS_RANDOM.STRING() function:
select dbms_random.string('l',8) as result from dual; RESULT ------ yhxctomv
DBMS_RANDOM BY EXAMPLES
All the explanations of the examples can be found in the document that i shared from here. You can also find the link at the bottom of this post to the same document.
-------------------------- --masking the card numbers --Ex.1 -------------------------- select lpad(round(dbms_random.value*power(10,4)),4,0) || '-' || lpad(round(dbms_random.value*power(10,4)),4,0) || '-' || lpad(round(dbms_random.value*power(10,4)),4,0) || '-' || lpad(round(dbms_random.value*power(10,4)),4,0) as card_number from dual connect by level <=5; /* CARD_NUMBER ----------- 2877-6639-0728-5456 6026-6002-2218-9038 7679-8441-0899-2826 8294-6783-6110-7988 1836-0407-9206-3333 */ -------------------------- --masking the card numbers --Ex.2 -------------------------- select ltrim(to_char(dbms_random.value(1,9999),'0000')) || '-' || ltrim(to_char(dbms_random.value(1,9999),'0000')) || '-' || ltrim(to_char(dbms_random.value(1,9999),'0000')) || '-' || ltrim(to_char(dbms_random.value(1,9999),'0000')) as card_number from dual connect by level <=5; /* CARD_NUMBER ----------- 1558-9846-7194-5325 5109-3233-0641-9209 3081-5946-9840-6615 4400-9638-6333-9113 2928-9883-1771-0465 */ -------------------------- --masking the card numbers --Ex.3 -------------------------- select ltrim(replace(to_char(round(dbms_random.value*power(10,16)),'0000,0000,0000,0000'),',','-')) as card_number from dual connect by level <=5; /* CARD_NUMBER ----------- 0157-8125-6418-6025 3829-9039-1357-9048 2876-1086-5371-8152 2775-1748-2591-2523 2058-2404-1101-5320 */ -------------------------- --masking the card numbers --Ex.4 -------------------------- select substr(abs(dbms_random.random),1,4) || '-' || substr(abs(dbms_random.random),1,4) || '-' || substr(abs(dbms_random.random),1,4) || '-' || substr(abs(dbms_random.random),1,4) as card_number from dual connect by level <=5; /* CARD_NUMBER ----------- 8639-7576-1359-3965 1317-1525-2526-1796 1043-5881-1000-7113 2106-3239-8662-3769 1461-7473-5870-6829 */ -------------------------- --masking the phone number --Ex.5 -------------------------- select '+' || round(DBMS_RANDOM.VALUE(1,99)) || '-' || round(DBMS_RANDOM.VALUE(10,99)) || '-' || round(DBMS_RANDOM.VALUE(1000000,9999999)) as phone_number from dual connect by level <= 5; /* PHONE_NUMBER ------------ +8-44-9146987 */ ---------------------- --masking the postcode --Ex.6 ---------------------- select round(dbms_random.value(1000,9999)) || '-' || dbms_random.string('U',2) as postcode from dual; /* POSTCODE -------- 4997-QP */
What if, you have a password verify function which commits the passwords will be at least 8 characters long and must contain alphanumeric characters and this password complexity merged with a profile which has a password_lifetime of two month and password_reuse_max is four. If there is a predefined automatic case which explained detailly in the paper mentioned before which needs random password generation then there is the example which can be used;
----------------- --random password --Ex.7 ----------------- select DBMS_RANDOM.STRING('A',1) || round(DBMS_RANDOM.VALUE()*10) || DBMS_RANDOM.STRING('X',6) as password from dual; /* PASSWORD -------- w7N3C1YG */
TEST DATA GENERATION
I generally use two different methods while generating test data to fill the test tables. One of them is by using the DBMS_RANDOM package and the other is filling the bulk data in the columns with the same output as you can find in the following examples.
In this first part of the following example, the code tries to simulate a username bu using lower case string and with random lengths between five and fifteen. The firstname starts with uppercase by using INICAP() function and the lastname is fully in uppercase by using the UPPER() inline function. The second part is not that clever and it just creates the same data over and over which can substitude a customer name or a username.
---------------------- --test data generation --Ex.8 ---------------------- select initcap(dbms_random.string('L',round(dbms_random.value(5,15)))) || ' ' || upper(dbms_random.string('L',round(dbms_random.value(5,15)))) as name from dual connect by level <= 5; /* NAME ---- Abzvsidgbcfa AGUGIR Wvuogptkxwhdwa IPOOXTVBLLCNPV Yiwcgh SGPFKJYCDISO Radshiyidcrst ZNKNSEYUZXVWY Daxeqzugq LKJILZJEYULVI */ select initcap(lpad('x',round(dbms_random.value(5,15)),'x')) || ' ' || upper(lpad('y',round(dbms_random.value(5,15)),'y')) as name from dual connect by level <= 5; /* NAME ---- Xxxxxxxxxxxx YYYYYYYYYYY Xxxxxxxxxxxxxxx YYYYYYYYYYYYY Xxxxxxxxxxxxxx YYYYYYYYY Xxxxxxxx YYYYYYYYYYY Xxxxxx YYYYYYYY */ select initcap(lpad('x',9,'x')) || ' ' || upper(lpad('y',9,'y')) as name from dual connect by level <= 5; /* NAME ---- Xxxxxxxxx YYYYYYYYY Xxxxxxxxx YYYYYYYYY Xxxxxxxxx YYYYYYYYY Xxxxxxxxx YYYYYYYYY Xxxxxxxxx YYYYYYYYY */
GENERATING MEANINGFUL DATA
Upto now, the generated data was completely dummy. For example while masking the credit card number, it wasn’t important if this was a valid card number or not, or if the names are real names or not. All the data generated were dummy random data which does not make any sense, they are just string or numbers formed by number characters or digits.
But by using the following example random data can be generated which makes sense (or a little sensible than the previous methods).
In this small example, there is a lookup table (which can be extended as far as the individual case needs) and the phone number generator picks up a random country code from the lookup table to generate the phone number. Every random phone number generated will have a valid country code by using the following example. Of course this example can be extended for the area codes as well.
-------------------------- --generating sensible data --Ex.9 -------------------------- create table t_country_codes (key number(2), code number(2), country varchar2(20)); insert into t_country_codes values (1,1,'United States'); insert into t_country_codes values (2,31,'Netherlands'); insert into t_country_codes values (3,44,'United Kingdom'); insert into t_country_codes values (4,49,'Germany'); insert into t_country_codes values (5,90,'Turkey'); commit; select '+' || code || '-' || round(dbms_random.value(10,99)) || '-' || round(DBMS_RANDOM.VALUE(100000,999999)) as phone_number from t_country_codes where key=(select round(DBMS_RANDOM.VALUE(1,5)) from dual); /* PHONE_NUMBER ------------ +31-54-732777 */
This post is taken from the paper i wrote for an internal use and enchanced for my blog post. You can download the paper from the following link if you have google docs access. Generating Random Data in Oracle 11g Database
Data companies are commonly associated with data center to purchase a product or service,and another have services Data Sales, B2B consumer data, Data Bureau suppliers service uk, Lead generation, Data lists, Data services for all information visit on www.ukdatahouse.com
ReplyDeleteThanks & Regards
Data Services
Really helpful information.. Thanks for sharing
ReplyDelete