To integrate Python or R scripts in Power BI for advanced statistical analysis, follow these best practices:
1. Enable Python or R in Power BI
- 
Go to: File > Options and settings > Options > Python scripting / R scripting
 
- 
Install Python (Anaconda recommended) or R with necessary packages:
- 
Python: pandas, numpy, scikit-learn, matplotlib, etc.
 
- 
R: ggplot2, dplyr, forecast, caret, etc.
 
 
- 
Set the correct Python/R home directory in Power BI settings.
 
2. Use Python or R in Power Query (Transform Data)
- 
In Power Query, select Transform Data > Run Python Script / Run R Script.
 
- 
Example for Python (Regression Analysis):
 
import pandas as pd
from sklearn.linear_model import LinearRegression
# Load Power BI dataset
dataset = dataset.dropna()  # Handle missing values
# Define model
X = dataset[['Feature1', 'Feature2']]
y = dataset['Target']
model = LinearRegression().fit(X, y)
# Predict and return results
dataset['Prediction'] = model.predict(X)
Example for R (Clustering Analysis):
library(dplyr)
library(cluster)
# Remove missing values
dataset <- na.omit(dataset)
# Perform k-means clustering
set.seed(123)
kmeans_result <- kmeans(dataset[, c("Feature1", "Feature2")], centers=3)
# Append cluster labels
dataset$Cluster <- as.factor(kmeans_result$cluster)
3. Visualize Results in Power BI
- 
Use Tables, Charts, or Custom Visuals to display Python/R-generated insights.
 
- 
For advanced visualizations:
 
import matplotlib.pyplot as plt
plt.plot(dataset['Date'], dataset['Prediction'], label="Forecast")
plt.legend()
plt.show()
 
4. Ensure Performance & Compatibility
- 
Avoid large datasets—Power BI limits script processing to 30 minutes and 500MB RAM.
 
- 
Run heavy computations externally (Azure ML, Databricks, SQL with ML Services).
 
- 
Schedule Power BI dataset refresh for periodic updates in Power BI Service.