PTR logo

Blog Post

SQL Server 2016 - R Service

SQL Server 2016 - R Service
MD

Author

Mandy Doward

Date

September 29, 2016

Length

3 mins

One More Language to Learn!

So, we have had SQL, TSQL, MDX, DMX, DAX to keep us busy in the SQL Server world for years now so I guess it was about time another language came along to join them!

SQL Server 2016 enters the world of Big Data with its PolyBase and R Services.

The SQL Server R Service

The R Service enables integration of R scripts into TSQl scripts. 

What is R?

Well this is what you get if you take a look at Wiki:

It is an Open Source language for staistical analysis and graphics, and is commonly used by statisticians and data miners is what it says here.

The R Project

R is free and downloadable from the GNU R Project website: https://www.r-project.org/

As well as providing libraries/packages that enable complex analysis to be carried out on data sets, there are many libraries/packages that assist in cleansing data before it can be analysed.

SQL Server & R Scripting

SQL Server 2016 has introduced the R service that provides stored procedures to allow R scripts to be executed from Transact SQL scripts. The following example uses the new WideWorldImporters sample database and shows how an R regular expression library called stringr can be used to employ powerful regular expressions to retrieve values from poorly structured relational data:

The sp_execute_external_script stored procedure is called specifying the R language (@language variable) as the scripting language to be used.

The @script variable contains the actual R script to be executed.

The R script firstly defines variables that will be use:

The R unlist function is used to extract the values from all records for the column aliased as Street in the Input Data Object called Customer_AddressLine2.

The same is done to extract the customer name values.

The housenumber variable is a regular expression that matching a string of consecutive digits 1 to 9 (1 ore more) immediately followed by 0 or 1 lowercase letters.

hnumber is set to the result of the str_match function, a member of the stringr library, which extracts just the matching string for the regular expression in housenumber.

The results are returned via the OutputDataSet variable which in turn is populated by the data.frame function with the threee values of interest for the results set.

As can be seen from the screen shot above, the result is that the building number is extracted from the street line of the customer address.

Here is the full script for you:

Did you find this interesting? Would you like to find out more? Drop us a line at info@ptr.co.uk.

Share This Page

MD

Mandy Doward

Managing Director

PTR’s owner and Managing Director is a Microsoft MCSE certified Business Intelligence (BI) Consultant, with over 30 years of experience working with data analytics and BI.

Latest Articles

PTR FAQs

See our FAQs or get in contact for more information on any of our services, solutions, or to start your PTR journey.

Ready to take your business to the next level?

Reach out to our team of experts and learn more about our consultancy and training services.