Creating my Fake News Detector
I once stumbled upon an online article that looked reliable at first. The formatting was polished and the headline was striking. Yet as I kept reading, some statements did not align with facts I already knew. That personal experience inspired me to imagine a tool that could highlight questionable claims. I decided to create a Fake News Detector using Streamlit and OpenAI.

The project became a way to combine a lightweight user interface with the reasoning of language models. I structured the repository with a main Python script, a dataset of seed examples, and configuration files. The rest of this post walks step by step through every block of code. Each code snippet is followed by a clear explanation of what it does and how it fits in.
requirements.txt
streamlit
openai
pandas
numpy
This file lists the essential dependencies. Streamlit builds the interactive web application. OpenAI provides access to large language models and embeddings. Pandas and NumPy support data handling and numeric operations. Keeping the list short ensures installation is straightforward.
README.md
# Fake News Detector
A simple web app to detect potential fake news using OpenAI models and Streamlit.
The README introduces the project briefly. It makes clear that this is a Streamlit app powered by OpenAI models. Such a minimal description is enough for someone browsing GitHub to understand the intent. Documentation, even if short, is a sign of a usable repository.
data/seed_examples_fake_news.csv
This CSV file stores labeled examples of fake and real news. These examples serve as a reference when comparing embeddings. By having some initial labeled cases, the app can provide more consistent classification. Even a small dataset can make outputs steadier across runs.
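The actual rows are not reproduced here, but the loading code later expects text and label columns, so the file looks roughly like this (illustrative rows, not the real dataset):
text,label
"Scientists announce chocolate cures all known diseases",FAKE
"Central bank holds interest rates steady this quarter",REAL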
app.py Detailed Walkthrough
Code Block 1
import json
import math
from typing import Tuple, Dict, List
import numpy as np
import pandas as pd
import streamlit as st
from openai import OpenAI
# ---------- Page + globals ----------
st.set_page_config(page_title="Fake News Detector", layout="centered")
st.title("Fake News Detector")
# OpenAI client from Streamlit secrets (set in Streamlit Cloud)
# Secrets UI keeps your key out of Git history.
# Docs: https://platform.openai.com/docs/api-reference and Streamlit secrets docs.
client = OpenAI(api_key=st.secrets["OPENAI_API_KEY"])
GPT_MODEL_DEFAULT = "gpt-4.1-mini"
EMBED_MODEL = "text-embedding-3-small" # fast, inexpensive
# ---------- Utilities ----------
This opening block wires up the app. It configures the Streamlit page title and layout, then creates the OpenAI client using the API key stored in Streamlit secrets (the Secrets UI on Streamlit Cloud, or a local .streamlit/secrets.toml file), which keeps the key out of Git history. Finally it defines two module-level constants: the default chat model for classification and the embedding model used for the seed bank. Because this code runs at the top level on every rerun, the client and constants are available to everything that follows.
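As a quick sanity check that the client and constants are wired up, a throwaway snippet like the following (not part of the repo) can be run once:
# Minimal sketch, assuming the OPENAI_API_KEY secret is set.
# Embeds one string and prints the vector length (1536 for text-embedding-3-small).
resp = client.embeddings.create(model=EMBED_MODEL, input=["hello world"])
print(len(resp.data[0].embedding))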
Code Block 2
def _safe_parse_json(s: str) -> Dict:
    """
    Accepts a model's raw string and tries to produce a dict with the keys we need.
    Falls back to a conservative guess if parsing fails.
    """
    try:
        # Try to find a JSON object if extra text surrounds it
        start = s.find("{")
        end = s.rfind("}")
        if start != -1 and end != -1:
            s = s[start:end + 1]
        data = json.loads(s)
        if not isinstance(data, dict):
            raise ValueError("Top-level JSON is not an object.")
        return data
    except Exception:
        return {"label": "REAL", "confidence": 0.5, "rationale": "Could not parse model JSON; defaulted."}
The helper _safe_parse_json deals with the fact that a model sometimes wraps its JSON answer in extra prose. It locates the outermost {...} span, parses it with json.loads, and checks that the result is actually a dictionary. If anything goes wrong, it returns a conservative default (REAL, confidence 0.5, and an explanatory rationale) instead of raising, so the UI always has something well-formed to display.
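Two hypothetical inputs show both paths:
messy = 'Sure! Here is the JSON: {"label": "FAKE", "confidence": 0.9, "rationale": "No source."} Hope that helps.'
print(_safe_parse_json(messy))
# -> {'label': 'FAKE', 'confidence': 0.9, 'rationale': 'No source.'}
print(_safe_parse_json("not json at all"))
# -> the conservative default: {'label': 'REAL', 'confidence': 0.5, ...}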
Code Block 3
def _format_output_card(label: str, confidence: float, rationale: str, show_reason: bool):
    label_badge = "🟢 REAL" if label.upper() == "REAL" else "🔴 FAKE"
    st.subheader("Result")
    st.markdown(f"**Label:** {label_badge}")
    st.markdown(f"**Confidence:** {confidence:.2f}")
    if show_reason and rationale:
        st.markdown(f"**Rationale:** {rationale}")
The helper _format_output_card renders the result in the main panel: a colored badge for the label (🟢 REAL or 🔴 FAKE), the confidence to two decimal places, and, if the sidebar toggle is on, the model's rationale. Centralizing the presentation here means both classification modes produce identically formatted output.
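Inside a running Streamlit session, a call looks like this (made-up values):
_format_output_card("FAKE", 0.92, "Extraordinary claim with no credible source.", show_reason=True)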
Code Block 4
def _cosine(a: np.ndarray, b: np.ndarray) -> float:
    denom = (np.linalg.norm(a) * np.linalg.norm(b))
    if denom == 0:
        return 0.0
    return float(np.dot(a, b) / denom)
# ---------- Embeddings seed bank (cached) ----------
@st.cache_resource(show_spinner=True)
The helper _cosine computes cosine similarity between two vectors, returning 0.0 when either vector has zero length to avoid division by zero. Note that the @st.cache_resource decorator at the end of this block belongs to load_seed_bank in the next block: it tells Streamlit to build the seed bank once per session and reuse it, rather than re-embedding the CSV on every interaction.
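For intuition, a tiny hand-picked example:
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
# dot product = 1.0, norms are 1.0 and sqrt(2), so similarity ≈ 1 / 1.414
print(round(_cosine(a, b), 3))  # 0.707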
Code Block 5
def load_seed_bank() -> Tuple[pd.DataFrame, np.ndarray, Dict[str, np.ndarray]]:
    """
    Loads seed examples from CSV, gets embeddings once, and pre-computes label centroids.
    Returns (df_with_vectors, matrix_vectors, centroids_by_label).
    """
    df = pd.read_csv("data/seed_examples_fake_news.csv")
    df = df.dropna(subset=["text", "label"]).reset_index(drop=True)
    # Create embeddings in small batches to stay efficient.
    # API ref: https://platform.openai.com/docs/api-reference/embeddings
    texts = df["text"].tolist()
    vecs: List[np.ndarray] = []
    # The embeddings endpoint supports batching; we keep it simple here.
    resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
    for item in resp.data:
        vecs.append(np.array(item.embedding, dtype=np.float32))
    M = np.vstack(vecs)  # [N, D]
    df["__vec_index"] = np.arange(len(df))
    # Compute centroids per label
    centroids: Dict[str, np.ndarray] = {}
    for label in sorted(df["label"].unique()):
        idx = df.index[df["label"] == label].tolist()
        if not idx:
            continue
        arr = M[idx]
        centroids[label] = arr.mean(axis=0)
    return df, M, centroids
# ---------- Classifiers ----------
load_seed_bank reads the seed CSV, drops rows missing text or label, and embeds every example in a single call to the embeddings endpoint. The vectors are stacked into a matrix M, and the vectors for each label are averaged into a centroid, giving one reference point for REAL and one for FAKE. Thanks to @st.cache_resource, this work, including the API call, happens once per session rather than on every rerun.
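A throwaway inspection of the cached bank (assuming the seed CSV sketched earlier) would look like:
df, M, centroids = load_seed_bank()
print(df.shape, M.shape)         # e.g. (N, 3) and (N, 1536) for text-embedding-3-small
print(sorted(centroids.keys()))  # expected: ['FAKE', 'REAL']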
Code Block 6
def classify_with_gpt(text: str, model: str = GPT_MODEL_DEFAULT) -> Dict:
    """
    Few-shot JSON classification using the Responses API.
    Returns dict with keys: label, confidence, rationale.
    """
    system = (
        "You are a strict news veracity classifier. "
        "Return compact JSON with keys: label (REAL|FAKE), confidence (0..1), rationale (short). "
        "Be decisive. No surrounding prose."
    )
    fewshots = [
        {"role": "user", "content": "TEXT: CDC confirms water can cure COVID-19 overnight"},
        {"role": "assistant", "content": '{"label":"FAKE","confidence":0.92,"rationale":"Extraordinary medical claim with no credible source."}'},
        {"role": "user", "content": "TEXT: WHO releases updated influenza guidance for the upcoming season"},
        {"role": "assistant", "content": '{"label":"REAL","confidence":0.88,"rationale":"Routine guidance from a credible public health body."}'},
    ]
    messages = [{"role": "system", "content": system}] + fewshots + [
        {"role": "user", "content": f"TEXT: {text}"}
    ]
    # Responses API (Python): https://platform.openai.com/docs/api-reference
    resp = client.responses.create(model=model, input=messages)
    # The SDK exposes response text via .output_text; if not present, stitch from content parts.
    raw = getattr(resp, "output_text", None)
    if raw is None:
        try:
            parts = []
            for item in resp.output:
                if hasattr(item, "content") and item.content:
                    for c in item.content:
                        if getattr(c, "type", "") == "output_text":
                            parts.append(c.text)
            raw = "".join(parts) if parts else ""
        except Exception:
            raw = ""
    data = _safe_parse_json(raw or "")
    # Normalize / validate
    label = str(data.get("label", "REAL")).upper()
    if label not in {"REAL", "FAKE"}:
        label = "REAL"
    try:
        conf = float(data.get("confidence", 0.5))
    except Exception:
        conf = 0.5
    conf = max(0.0, min(1.0, conf))
    rationale = str(data.get("rationale", ""))
    return {"label": label, "confidence": conf, "rationale": rationale}
classify_with_gpt is the few-shot path. It builds a strict system prompt, adds two worked examples (one FAKE, one REAL) so the model sees the exact JSON shape expected, appends the user's text, and sends everything to the Responses API. It then pulls the raw text from output_text (or stitches it together from content parts if that attribute is missing), parses it with _safe_parse_json, and normalizes the result: the label is forced to REAL or FAKE and the confidence is clamped to the [0, 1] range before the dict is returned.
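Called on a dubious headline, it should return something shaped like this (illustrative values; the actual model output will vary):
result = classify_with_gpt("Aliens endorse new smartphone, experts say")
print(result)
# e.g. {'label': 'FAKE', 'confidence': 0.91, 'rationale': 'Sensational claim with no credible sourcing.'}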
Code Block 7
def classify_with_embeddings(text: str, k: int = 5) -> Dict:
    """
    k-NN w/ label centroids:
      - Embed input
      - Compare to label centroids (REAL / FAKE)
      - Convert similarity to softmax-like confidence
    """
    df, M, centroids = load_seed_bank()
    # Embed the input text
    emb = client.embeddings.create(model=EMBED_MODEL, input=[text]).data[0].embedding
    q = np.array(emb, dtype=np.float32)
    # Cosine vs. each centroid
    sims = {label: _cosine(q, vec) for label, vec in centroids.items()}
    # Pick best label
    label = max(sims, key=sims.get)
    # Turn the two scores into a pseudo-probability with softmax
    vals = np.array(list(sims.values()), dtype=np.float32)
    exps = np.exp(vals - vals.max())
    probs = exps / exps.sum()
    conf = float(probs[list(sims.keys()).index(label)])
    rationale = f"Nearest-centroid by embeddings. Similarity {sims[label]:.3f} vs. {label} centroid."
    return {"label": label, "confidence": conf, "rationale": rationale}
# ---------- UI ----------
st.sidebar.header("Settings")
mode = st.sidebar.selectbox("Mode", ["GPT Classifier", "Embeddings k-NN"])
gpt_model = st.sidebar.text_input("GPT model (for GPT mode)", GPT_MODEL_DEFAULT)
show_reason = st.sidebar.checkbox("Show rationale", value=True)
min_conf = st.sidebar.slider("Minimum confidence to accept label", 0.0, 1.0, 0.5, 0.05)
text = st.text_area(
    "Paste a headline or short article:",
    height=220,
    placeholder="Example: Government announces tax filing deadline extended to October this year",
)
col1, col2 = st.columns(2)
with col1:
    run = st.button("Classify", type="primary")
with col2:
    st.button("Clear", on_click=lambda: st.session_state.update({"_last": None}))
if run and text.strip():
    try:
        if mode == "GPT Classifier":
            result = classify_with_gpt(text, model=gpt_model.strip() or GPT_MODEL_DEFAULT)
        else:
            result = classify_with_embeddings(text)
        # Confidence gate
        label = result["label"]
        conf = result["confidence"]
        rationale = result.get("rationale", "")
        if conf < min_conf:
            st.warning(f"Low confidence ({conf:.2f}). Treat with caution.")
        _format_output_card(label, conf, rationale, show_reason)
        # Session table of recent classifications
        row = {"text": text, "label": label, "confidence": conf, "mode": mode}
        st.session_state.setdefault("history", []).insert(0, row)
        st.caption("Recent (session only)")
        st.dataframe(pd.DataFrame(st.session_state["history"][:20]))
    except Exception as e:
        st.error(f"Error: {e}")
        st.stop()
st.markdown("---")
st.caption(
    "This demo uses OpenAI’s API for GPT classification and embeddings. "
    "Keep inputs short to control cost and latency."
)
This final block does two jobs. classify_with_embeddings embeds the input text, measures its cosine similarity to each label centroid from the seed bank, picks the closer label, and converts the two similarities into a pseudo-probability with a softmax (despite the k parameter, it is a nearest-centroid classifier rather than a true k-NN). The rest of the block is the UI: sidebar controls for the mode, GPT model name, rationale toggle, and confidence threshold; a text area with Classify and Clear buttons; the classification call, with a warning when confidence falls below the chosen threshold; a session-only history table of recent classifications; and a closing caption about cost and latency.
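To make the confidence step concrete, here is the same softmax applied to made-up similarities:
sims = {"FAKE": 0.62, "REAL": 0.48}                # hypothetical cosine similarities
vals = np.array(list(sims.values()), dtype=np.float32)
exps = np.exp(vals - vals.max())
probs = exps / exps.sum()
print(dict(zip(sims.keys(), np.round(probs, 3))))  # roughly {'FAKE': 0.535, 'REAL': 0.465}
Because cosine similarities sit in a narrow band, these softmax confidences cluster near 0.5; the sidebar threshold exists precisely so low-margin results get flagged rather than presented as certain.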