Showing posts with the label Final year project

Phishing Website Detector

  This final-year Computer Science project develops a Phishing Website Detector that identifies malicious URLs designed to steal sensitive user information, such as login credentials or financial details. The system combines machine learning with rule-based logic to analyze URL features and classify each URL as phishing or legitimate. Built in Python, it uses scikit-learn to train models such as Random Forest or Logistic Regression, and Flask to deploy a web interface where users can submit URLs for real-time detection. Key skills include cybersecurity (understanding phishing techniques), machine learning (model development), and web development (deployment). The system processes datasets such as the UCI Phishing Websites dataset, extracting features such as URL length, special characters, and domain properties. It achieves over 90% accuracy on test sets and is also evaluated with precision, recall, and F1-score. Th...
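The feature extraction and rule-based side of such a detector can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the feature set, and every threshold below are assumptions made for illustration, not the project's actual rules.

```python
import re
from urllib.parse import urlparse

# Characters often counted as "special" in URL-based features (an assumed set).
SPECIAL_CHARS = "@-_?=&%"


def extract_url_features(url: str) -> dict:
    """Turn a raw URL into a small dictionary of lexical features."""
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "special_char_count": sum(url.count(c) for c in SPECIAL_CHARS),
        "has_at_symbol": "@" in url,
        # Dots beyond the one separating the domain from its TLD.
        "subdomain_count": max(host.count(".") - 1, 0),
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
    }


def rule_based_flag(features: dict) -> bool:
    """Flag a URL when at least two hypothetical warning signs are present."""
    score = 0
    score += features["url_length"] > 75      # assumed length threshold
    score += features["has_at_symbol"]
    score += features["host_is_ip"]
    score += features["subdomain_count"] >= 3  # assumed subdomain threshold
    return score >= 2
```

In a combined system, the same feature dictionary could be converted to a numeric vector and passed to a trained classifier, with the rule-based flag acting as a second signal.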

Sentiment Analysis of Social Media Posts

  Abstract: This final-year Computer Science project develops a Sentiment Analysis system for social media posts, analyzing tweets (now X posts) or similar content to classify public sentiment as positive, negative, or neutral. The system uses Natural Language Processing (NLP) techniques to preprocess text, machine learning algorithms for classification, and data scraping methods to collect real-time or historical posts. Built in Python, it uses NLTK for NLP tasks such as tokenization, stemming, and sentiment scoring; BeautifulSoup for web scraping to gather posts from accessible platforms (e.g., Reddit or public forums, given the API restrictions on X as of 2025); and scikit-learn for training classifiers such as Naive Bayes or SVM. The project processes datasets such as the Twitter Sentiment Analysis dataset or scraped data, achieving an accuracy of roughly 75-85% on test sets through techniques like TF-IDF vectorizat...
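To make the TF-IDF step concrete, here is a minimal standard-library sketch of the textbook weighting (term frequency times the log of inverse document frequency). The tokenizer and function names are assumptions for illustration. A library implementation such as scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its numbers will differ from these.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dictionary per document.

    weight = (count of term in doc / tokens in doc) * log(N / docs containing term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty posts
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

Terms that appear in every post receive a weight of zero under this formula, which is why words carrying sentiment tend to dominate the vectors passed to a classifier such as Naive Bayes or SVM.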

Stock Price Prediction Using LSTM Models

  Abstract: This final-year Computer Science project develops a Stock Price Prediction system that forecasts future stock market trends from historical data using Long Short-Term Memory (LSTM) networks, a type of recurrent neural network (RNN) suited to time-series analysis. The system processes historical stock data (e.g., open, high, low, and close prices, plus volume) from sources like Yahoo Finance, performs feature engineering with Pandas, and trains an LSTM model in TensorFlow to predict closing prices. Key skills include time-series analysis for handling sequential data, trends, and seasonality, and deep learning for building and optimizing neural networks. The tools used are TensorFlow for model implementation, Pandas for data manipulation and preprocessing, and Jupyter Notebook for interactive development, visualization, and experimentation. The model achieves a Mean Squared Error (MSE) of approximately 0.01-0.05 on normalized test data, demonstrating reasonable accuracy f...
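Before an LSTM can be trained, the closing prices have to be normalized and cut into fixed-length input windows with a next-step target. The sketch below shows that preparation and the MSE metric in plain Python; the function names, the window length, and the sample prices are assumptions for illustration, and the model itself is not shown.

```python
def min_max_scale(values):
    """Scale values into [0, 1] using the minimum and maximum of the list."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # a constant series would otherwise divide by zero
    return [(v - lo) / span for v in values]


def make_windows(series, window):
    """Split a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical closing prices, used only to exercise the helpers.
closes = [100.0, 102.0, 101.0, 105.0, 110.0]
scaled = min_max_scale(closes)
X, y = make_windows(scaled, window=3)

# Naive baseline: predict that the next value equals the last value seen.
baseline_preds = [w[-1] for w in X]
baseline_mse = mean_squared_error(y, baseline_preds)
```

Comparing the model's MSE against a naive baseline like this one helps judge whether a low error on normalized data reflects real predictive value. In practice the scaling bounds would usually be computed on the training split only, to avoid leaking test-set information.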

Movie Recommendation System

  Abstract: This project develops a Movie Recommendation System that suggests movies to users based on their preferences and historical ratings using collaborative filtering. The system applies machine learning algorithms to user-item interactions, identifying similar users or items to generate personalized recommendations. Built primarily in Python, it uses Pandas for data manipulation and scikit-learn for recommendation algorithms such as K-Nearest Neighbors (KNN) for user-based filtering. The system processes large datasets, such as the MovieLens dataset, to compute similarity matrices and predict ratings for unseen movies. Key skills include machine learning for model training and evaluation, and recommendation algorithms for handling sparse data. Recommendation accuracy is measured with metrics like Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), typically below 0.9 on test sets. Deployed as a command-line or web-ba...
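User-based collaborative filtering with nearest neighbours can be sketched in a few lines of plain Python. The example below is an illustration under stated assumptions, not the project's implementation: ratings are held in nested dictionaries, similarity is cosine similarity over co-rated movies only, and the function names and the default `k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity computed over the movies both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in common))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, movie, k=2):
    """Similarity-weighted average of the k nearest users who rated the movie."""
    neighbours = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[movie])
        for other, other_ratings in ratings.items()
        if other != user and movie in other_ratings
    ]
    top_k = sorted(neighbours, reverse=True)[:k]
    weight = sum(sim for sim, _ in top_k)
    if weight == 0:
        return None  # no usable neighbours for this movie
    return sum(sim * rating for sim, rating in top_k) / weight


def mae(y_true, y_pred):
    """Mean Absolute Error between held-out and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error between held-out and predicted ratings."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

For example, with `ratings = {"alice": {"A": 5, "B": 3}, "bob": {"A": 5, "B": 3, "C": 4}, "carol": {"A": 1, "B": 5, "C": 2}}`, calling `predict_rating(ratings, "alice", "C", k=1)` relies on bob alone, since his ratings match alice's on the movies they share. On a sparse dataset such as MovieLens, restricting similarity to co-rated movies is one common choice; mean-centering ratings first is another.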