AI Basics

Get started with artificial intelligence using Python and scikit-learn for machine learning.

Introduction to Artificial Intelligence

Artificial Intelligence (AI) enables machines to mimic human intelligence, performing tasks like decision-making and pattern recognition. Machine learning (ML), a subset of AI, allows systems to learn from data. This tutorial focuses on supervised learning, where models predict outcomes based on labeled data, using Python and scikit-learn.

Setting Up Your Environment

Install Python 3 from python.org and verify the installation (on some systems the command is `python3` rather than `python`):

python --version

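Install scikit-learn and the other libraries used in this tutorial (including matplotlib, which the visualization section needs) using pip:

pip install scikit-learn numpy pandas matplotlib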

Create a new Python file, e.g., `ai_model.py`, to start coding.
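As a quick sanity check, the short script below imports each library and prints its version. It is only a sketch for confirming that the installation worked; the exact version numbers you see will depend on your environment.

```python
# Confirm that the core libraries import correctly and report their versions.
import sklearn
import numpy as np
import pandas as pd

print("scikit-learn:", sklearn.__version__)
print("numpy:", np.__version__)
print("pandas:", pd.__version__)
```

If any import fails with `ModuleNotFoundError`, re-run the pip command in the same environment that you use to run the script.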

Understanding Supervised Learning

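Supervised learning involves training a model on input-output pairs: each input (a set of features) comes with the correct output (a label). For example, in a classification task, you predict categories (e.g., spam vs. not spam). We’ll use the Iris dataset, a classic dataset of 150 flowers from three species, each described by four measurements (sepal length, sepal width, petal length, and petal width).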

Load the Iris dataset and explore it:

from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
print(data.head())

This displays the first five rows of the dataset, showing features like sepal length and petal width.
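To see the input-output pairing more directly, the sketch below attaches the species label to each row and summarizes the data. The `species` column name is an illustrative choice, not something defined by scikit-learn.

```python
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)

# iris.target holds integer labels (0, 1, 2); map them to readable names.
# "species" is a column name chosen for this example.
data["species"] = iris.target_names[iris.target]

print(data.shape)                      # rows and columns
print(data["species"].value_counts())  # samples per class
print(data.describe())                 # summary statistics for the numeric features
```

The class counts show that the dataset is balanced, with the same number of samples for each species.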

Data Preprocessing

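Before training, the data needs to be prepared. The key step here is to hold back part of the data so the model can be evaluated on examples it has not seen during training. Split the Iris dataset into training and testing sets: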

from sklearn.model_selection import train_test_split

X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

`test_size=0.2` means 20% of the data (30 of the 150 samples) is reserved for testing, and `random_state=42` fixes the random shuffle so the split is identical on every run.
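The sketch below checks the sizes of the resulting sets. It also passes `stratify=y`, an optional `train_test_split` argument that the code above does not use, to keep the class proportions the same in both sets; treat it as a variation rather than a required step.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# stratify=y keeps each species equally represented in train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)   # number of samples and features per set
print(Counter(y_test.tolist()))      # test samples per class
```

Without stratification, a small test set can end up with noticeably more samples of one class than another, which makes the accuracy estimate noisier.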

Building a Classification Model

Use a Decision Tree classifier to predict flower species:

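from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# random_state makes the tree's tie-breaking reproducible
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)        # learn decision rules from the training set
y_pred = model.predict(X_test)     # predict labels for the unseen test set
accuracy = accuracy_score(y_test, y_pred)  # fraction of correct predictions
print(f"Accuracy: {accuracy:.2f}")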

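This trains the model on the training data and evaluates its accuracy on the test data. Accuracy on Iris is usually high (0.9 or above), but the exact value depends on which samples land in the test set, so a single score from 30 test samples is only a rough estimate.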
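Because one train/test split gives only one accuracy number, a common complement is k-fold cross-validation, which repeats the train-and-evaluate cycle on several different splits. The sketch below uses scikit-learn's `cross_val_score` with five folds; the choice of five is an assumption for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

model = DecisionTreeClassifier(random_state=42)

# cv=5: split the data into 5 folds, train on 4 and test on the remaining 1,
# rotating so that every fold serves as the test set once.
scores = cross_val_score(model, X, y, cv=5)

print("Accuracy per fold:", scores.round(2))
print(f"Mean accuracy: {scores.mean():.2f} (std: {scores.std():.2f})")
```

The spread across folds gives a sense of how much the accuracy figure can vary with the choice of test samples.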

Practical Example: Predicting New Data

Test the model with a new sample (e.g., measurements for a new flower):

import numpy as np

new_flower = np.array([[5.0, 3.4, 1.5, 0.2]])  # Example measurements
prediction = model.predict(new_flower)
print(f"Predicted species: {iris.target_names[prediction[0]]}")

This outputs the predicted species (e.g., `setosa`). Adjust the input measurements to test different predictions.
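Alongside the predicted label, it can be useful to see how the model distributes its confidence across the classes. The sketch below retrains the same kind of tree and calls `predict_proba` on the sample. Note that a fully grown decision tree often reports probabilities of exactly 0 or 1, because its leaves usually contain training samples from a single class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Same example measurements as above: sepal length, sepal width,
# petal length, petal width (all in cm).
new_flower = np.array([[5.0, 3.4, 1.5, 0.2]])

# predict_proba returns one row per sample and one column per class.
probabilities = model.predict_proba(new_flower)[0]
for name, p in zip(iris.target_names, probabilities):
    print(f"{name}: {p:.2f}")
```

The input must be a 2D array with one row per flower and the four features in the same order as in the training data.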

Visualizing Results

Visualize feature importance to understand the model:

import matplotlib.pyplot as plt

importances = model.feature_importances_
plt.bar(iris.feature_names, importances)
plt.title("Feature Importance in Decision Tree")
plt.show()

This bar plot shows which features (e.g., petal length) most influence the model’s predictions.
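The same information can be inspected as text, which helps when a plot window is not available. The sketch below prints the importances in descending order and then uses scikit-learn's `export_text` to show the learned decision rules; it retrains the model so that it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pair each feature name with its importance and sort from most to least important.
importances = model.feature_importances_
ranked = sorted(zip(iris.feature_names, importances), key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")

# Print the tree as nested if/else rules over the feature values.
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```

The importances are normalized to sum to 1, so each value can be read as that feature's share of the tree's total impurity reduction.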

Next Steps

Explore advanced ML algorithms like Random Forests or Neural Networks. Try other datasets from scikit-learn (e.g., digits, breast cancer). Deploy your model using Flask or FastAPI for web integration. For deeper AI, study deep learning with TensorFlow or PyTorch.
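As a first step in that direction, the sketch below swaps the Decision Tree for a Random Forest while keeping the rest of the workflow unchanged. The setting `n_estimators=100` is an illustrative choice, not a tuned value.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# A Random Forest trains many decision trees on random subsets of the data
# and combines their votes; n_estimators sets the number of trees.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

forest_accuracy = accuracy_score(y_test, forest.predict(X_test))
print(f"Random Forest accuracy: {forest_accuracy:.2f}")
```

Because scikit-learn estimators share the same `fit` and `predict` interface, trying a different algorithm usually means changing only the line that creates the model.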
