Introduction
In our previous article, we discuss about how to run Python in SQL Server that hard coded. In this article we will discuss about the data which is already stored in our SQL Server Databases.
Steps to Process Python with SQL Server Data
In order to work with Data using Python, we have to follow these steps
- By Importing data using SELECT Statement (T-SQL Query)
- Process those data using Python code or using Python libraries
- Export the processed data in SQL Server Standard format
Importing Data using T-SQL
In the pervious article, we discussed about the parameters that are used in the System Stored Procedure sp_execute_external_script. We also discussed about the input optional parameter called @input_data_1. In this parameter we have to pass or write our standard T-SQL Query that selects data from our table(s).
The result of our T-SQL query is converted into Data frame and stored in variable name InputDataset.
Processing Data using Python code or Libraries
Now we have collected our required data. Now the next step is manipulate the data that is, the InputDataSet variable using Python code or Python libraries such as Pandas, the most widely used libraries. We no need to install this library, its already installed during our Python installation.
Exporting Our Data Frame
After manipulating the data, now we have to export our data back to SQL Server, which will represent the Data Frame in Standard SQL Server format, for that we can create another variable as OutputDataSet. Assign our manipulated data frame to this output variable.
When the script completes, the results will be displayed in Standard SQL Server Result Set by converting the data frame. That result can be used in report or just like any normal result set.
Sample Script
EXEC sp_execute_external_script @language = N'Python',
@script = N'
import pickle
irismodel = pickle.loads(nb_model)
species_pred = irismodel.predict(iris_data[[1,2,3,4]])
iris_data["PredictedSpecies"] = species_pred
OutputDataSet = iris_data.query( ''PredictedSpecies != SpeciesId'' )[[0, 5, 6]]
print(OutputDataSet)
'
, @input_data_1 = N'select id, "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "SpeciesId" from iris_data'
, @input_data_1_name = N'iris_data'
, @params = N'@nb_model varbinary(max)'
, @nb_model = @nb_model
WITH RESULT SETS ( ("id" int, "SpeciesId" int, "SpeciesId.Predicted" int));
Conclusion
In this simple article, we discussed how to process the dataset by importing from SQL Server Table. We will see more concepts on this in our upcoming articles. Please share your feedbacks in the comments section.