Learn the fundamentals of artificial intelligence with hands-on Python and scikit-learn examples.
Artificial Intelligence (AI) enables machines to mimic human intelligence, performing tasks like decision-making and pattern recognition. Machine learning (ML), a subset of AI, allows systems to learn from data. This tutorial focuses on supervised learning, where models predict outcomes based on labeled data, using Python and scikit-learn.
Install Python 3 from python.org and verify the installation (on some systems the command is `python3` rather than `python`):
python --version
Install scikit-learn and the other libraries used in this tutorial with pip:
pip install scikit-learn numpy pandas matplotlib
Create a new Python file, e.g., `ai_model.py`, to start coding.
Supervised learning involves training a model on input-output pairs. For example, in a classification task, you predict categories (e.g., spam vs. not spam). We’ll use the Iris dataset, a classic dataset for classifying flower species based on measurements.
Load the Iris dataset and explore it:
from sklearn.datasets import load_iris
import pandas as pd
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
print(data.head())
This displays the first five rows of the dataset, showing features like sepal length and petal width.
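The feature table alone does not show the labels the model will learn to predict. As a small sketch, the snippet below attaches them to the same DataFrame; the column names `target` and `species` are chosen here for illustration and are not part of the dataset.

```python
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)

# Attach the label for each row: the integer code and its readable name.
# "target" and "species" are illustrative column names.
data["target"] = iris.target
data["species"] = iris.target_names[iris.target]

print(data.shape)                      # number of rows and columns
print(data["species"].value_counts())  # samples per species
print(data.describe())                 # summary statistics per numeric column
```

Checking the class counts early is useful because a heavily imbalanced dataset would make plain accuracy a misleading metric. Iris has the same number of samples for each species.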
Before training, set aside part of the data so the model can be evaluated on samples it has not seen. Split the Iris dataset into training and testing sets:
from sklearn.model_selection import train_test_split
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
`test_size=0.2` reserves 20% of the data for testing, and `random_state=42` fixes the random seed so the same split is produced on every run.
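To see what the split produces, the sketch below prints the resulting array shapes. It also shows an optional variant using the `stratify` parameter of `train_test_split`. The tutorial's own split does not use it, and the `_s` variable names are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Same split as in the tutorial: 80% train, 20% test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)

# Optional variant (an addition, not the tutorial's split):
# stratify=y keeps the class proportions equal in both subsets.
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(np.bincount(y_test))    # class counts in the plain split
print(np.bincount(y_test_s))  # class counts in the stratified split
```

With only 30 test samples, a plain random split can leave one species under-represented. Stratifying is a common way to avoid that.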
Use a Decision Tree classifier to predict flower species:
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
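This trains the model on the training data and evaluates its accuracy on the test data. Iris is small and its classes are well separated, so accuracy is usually above 0.9, but the exact value depends on how the data was split.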
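A single accuracy figure from one 30-sample test set is a noisy estimate. As a hedged sketch of two ways to look more closely, the block below prints a per-class report for the held-out set and then runs 5-fold cross-validation. The choice of `cv=5` is an assumption made here, not something the tutorial prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Precision, recall and F1 for each species on the held-out test set.
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# 5-fold cross-validation: train on 4/5 of the data, score on the remaining 1/5,
# and repeat so that every sample is used for testing exactly once.
scores = cross_val_score(
    DecisionTreeClassifier(random_state=42), iris.data, iris.target, cv=5
)
print("Fold accuracies:", scores.round(2))
print(f"Mean accuracy: {scores.mean():.2f} (std {scores.std():.2f})")
```

The spread between folds gives a rough sense of how much the score can move with a different split.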
Test the model with a new sample (e.g., measurements for a new flower):
import numpy as np
# Column order: sepal length, sepal width, petal length, petal width (cm)
new_flower = np.array([[5.0, 3.4, 1.5, 0.2]])
prediction = model.predict(new_flower)
print(f"Predicted species: {iris.target_names[prediction[0]]}")
This outputs the predicted species (e.g., `setosa`). Adjust the input measurements to test different predictions.
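The same call works on several samples at once, and `predict_proba` shows how the predicted class was weighted. The sketch below refits the tutorial's model so that it runs on its own. The three measurement rows are hypothetical values made up for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Hypothetical measurements in the dataset's column order:
# sepal length, sepal width, petal length, petal width (all in cm).
new_flowers = np.array([
    [5.0, 3.4, 1.5, 0.2],
    [6.0, 2.8, 4.5, 1.4],
    [6.9, 3.1, 5.6, 2.3],
])

predictions = model.predict(new_flowers)          # one class index per row
probabilities = model.predict_proba(new_flowers)  # one row of class weights per sample

for row, label, proba in zip(new_flowers, predictions, probabilities):
    print(row, "->", iris.target_names[label], proba.round(2))
```

For a fully grown decision tree the probabilities are often exactly 0 or 1, because each leaf usually contains training samples from a single class.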
Visualize feature importance to understand the model:
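import matplotlib.pyplot as plt
importances = model.feature_importances_  # one score per feature, summing to 1.0
plt.bar(iris.feature_names, importances)
plt.ylabel("Importance")
plt.title("Feature Importance in Decision Tree")
plt.tight_layout()  # keep the long feature names from being clipped
plt.show()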
This bar plot shows which features (e.g., petal length) most influence the model’s predictions.
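If a plot is not convenient, the same information can be read as text. The sketch below ranks the importances and prints the learned decision rules with `export_text` from `sklearn.tree`. It refits the tutorial's model so the block is self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pair each feature name with its importance and sort from most to least important.
ranked = sorted(
    zip(iris.feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:<20} {score:.3f}")

# Print the tree as nested if/else rules on the feature thresholds.
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```

Reading the rules alongside the ranking shows where each feature is actually used: a feature with zero importance never appears in a split.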
Explore advanced ML algorithms like Random Forests or Neural Networks. Try other datasets from scikit-learn (e.g., digits, breast cancer). Deploy your model using Flask or FastAPI for web integration. For deeper AI, study deep learning with TensorFlow or PyTorch.
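As one possible starting point for these next steps, the sketch below applies a Random Forest to the digits dataset using the same split, fit and score workflow as above. The settings `n_estimators=100`, `test_size=0.2` and `random_state=42` are illustrative choices, not tuned values.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 8x8 grayscale images of handwritten digits, flattened to 64 features each.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# A Random Forest averages the votes of many decision trees,
# each trained on a random sample of the rows and features.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

forest_accuracy = accuracy_score(y_test, forest.predict(X_test))
print(f"Random Forest accuracy on digits: {forest_accuracy:.2f}")
```

Only the dataset and the estimator class change. The rest of the code is the same pattern used for the Decision Tree.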