The Fake News Detector is a project aimed at identifying and flagging fake news articles. It utilizes machine learning algorithms to analyze the content of news articles and determine their credibility. The dataset used for this project is available on Kaggle and is licensed under Creative Commons 4.0.
- Analyze the content using natural language processing techniques.
- Generate a credibility score for the article.
- Flag articles as potentially fake or reliable.
- Generate meaningful visualizations.
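The scoring and flagging features above can be illustrated with a minimal sketch. It assumes the model outputs a probability that an article is fake; the function name `flag_article`, the 0.5 threshold, and the output keys are hypothetical and are not taken from the project's code.

```python
def flag_article(fake_probability: float, threshold: float = 0.5) -> dict:
    """Turn a fake-news probability into a credibility score and a flag.

    Assumption: the classifier returns a probability in [0, 1] that the
    article is fake. The threshold is illustrative, not the project's value.
    """
    if not 0.0 <= fake_probability <= 1.0:
        raise ValueError("fake_probability must be between 0 and 1")
    # Credibility is defined here as the complement of the fake probability.
    credibility = 1.0 - fake_probability
    flag = "potentially fake" if fake_probability >= threshold else "reliable"
    return {"credibility_score": round(credibility, 3), "flag": flag}
```

Lowering the threshold flags more articles as potentially fake, at the cost of more false alarms.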
- Clone the repository:
git clone https://github.com/fake-news-detector.git
- Ensure that you are using the Python version specified in the .python-version file.
- Install the required dependencies:
pip install -r requirements.txt
- Download the two CSV files from Kaggle.
- Create the directories ./data/raw and ./data/processed.
- Place the two CSV files in ./data/raw.
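The directory setup can also be scripted. The following is a minimal sketch, assuming it is run from the repository root; the helper name `prepare_data_dirs` is hypothetical and not part of the project.

```python
from pathlib import Path


def prepare_data_dirs(root: Path = Path(".")) -> list[Path]:
    """Create data/raw and data/processed under root.

    Returns the CSV files currently present in data/raw, so the caller
    can check that both Kaggle files are in place.
    """
    raw = root / "data" / "raw"
    processed = root / "data" / "processed"
    # exist_ok=True makes the call safe to repeat.
    raw.mkdir(parents=True, exist_ok=True)
    processed.mkdir(parents=True, exist_ok=True)
    return sorted(raw.glob("*.csv"))
```

Calling `prepare_data_dirs()` after copying the downloads should return a list of two paths; any other length suggests a file is missing or misplaced.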
To analyze the data using the Jupyter notebook 20240315-analyze-individual-datasets.ipynb, follow these steps:
- Open the Jupyter notebook by running the following command in your terminal:
jupyter notebook
- Navigate to the directory where the notebook is located:
./notebooks
- Click on the 20240315-analyze-individual-datasets.ipynb file to open it.
- Run the notebook cell by cell to execute the code and analyze the individual datasets.
To create the training dataset using the Jupyter notebook 20240315-preprocess-data.ipynb, follow these steps:
- Click on the 20240315-preprocess-data.ipynb file to open it.
- Run the notebook cell by cell to execute the code and preprocess the data.
Note: Make sure to have the necessary dependencies installed before running the notebooks. Refer to the "Installation" section in the README for more information.
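As a rough illustration of what building a training dataset from two CSV files can involve, the sketch below merges them into one labeled file. It is not the notebook's code. It assumes one file holds fake articles and the other real ones, and that both share the same columns; the function names, the `label` column, and the 1/0 encoding are all hypothetical.

```python
import csv
from pathlib import Path


def build_training_rows(fake_csv: Path, real_csv: Path) -> list[dict]:
    """Read both CSV files and tag each row with a label (1 = fake, 0 = real).

    Assumption: the two files have identical column headers.
    """
    rows = []
    for path, label in ((fake_csv, 1), (real_csv, 0)):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["label"] = label
                rows.append(row)
    return rows


def write_training_csv(rows: list[dict], out_path: Path) -> None:
    """Write the labeled rows to a single CSV file, e.g. under ./data/processed."""
    if not rows:
        raise ValueError("no rows to write")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

A real preprocessing step would typically also clean the text and shuffle the rows before training; those steps are omitted here.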
To train the model, run the following command from the ./models directory:
python fake_news_distilbert_model_main.py
- Hardware limitations may prevent the model from training as the code is written.
- You can adjust the code in ./models/train_model.py accordingly.
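One common adjustment for limited memory is to reduce the per-step batch size and accumulate gradients over several steps so the effective batch size stays the same. The sketch below only shows the arithmetic. `TrainSettings`, its fields, and `shrink_for_memory` are hypothetical names and do not reflect what ./models/train_model.py contains.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainSettings:
    """Hypothetical training settings; not taken from the project's code."""
    batch_size: int = 32
    grad_accum_steps: int = 1

    @property
    def effective_batch_size(self) -> int:
        return self.batch_size * self.grad_accum_steps


def shrink_for_memory(settings: TrainSettings, max_batch_size: int) -> TrainSettings:
    """Halve the batch size until it fits, doubling accumulation steps each time.

    Halving stops early if the batch size becomes odd, so the result may
    still exceed max_batch_size in that case.
    """
    batch, accum = settings.batch_size, settings.grad_accum_steps
    while batch > max_batch_size and batch % 2 == 0:
        batch //= 2
        accum *= 2
    return replace(settings, batch_size=batch, grad_accum_steps=accum)
```

For example, `shrink_for_memory(TrainSettings(32, 1), max_batch_size=8)` yields a batch size of 8 with 4 accumulation steps, keeping the effective batch size at 32.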
To view the Streamlit application, navigate to ./streamlit and run:
streamlit run Home.py
Contributions are encouraged. If you would like to contribute, please feel free to open an issue or create a pull request.