Computational Sciences - Master's theses
Permanent URI for this collection: https://laurentian.scholaris.ca/handle/10219/2096
Recent Submissions
Item Prediction and survival analysis of head and neck cancer in patients using epigenomics data and advanced machine learning methods (2023-08-22) Chaudhary, Vikaskumar
Epigenomics is the field of biology dealing with modifications of the phenotype that do not cause any alteration in the cell's DNA sequence. Epigenetic modifications add marks on top of DNA that change its properties, ultimately preventing certain DNA functions from being carried out. Such modifications occur in cancer cells and are a major driver of cancer. The main objective of this research is to perform prediction and survival analysis of Head and Neck Squamous Cell Carcinoma (HNSCC), one of the leading causes of death, accounting for more than 650,000 cases and 330,000 deaths annually worldwide. Tobacco use, alcohol consumption, Human Papillomavirus (HPV) infection (for oropharyngeal cancer), and Epstein-Barr Virus (EBV) infection (for nasopharyngeal cancer) are the main risk factors associated with head and neck cancer. Males are more affected than females, with male-to-female ratios ranging from 2:1 to 4:1. Four different types of data are used in this research to predict HNSCC in patients: methylation, histone, human genome, and RNA-Sequence data. The data is accessed through open-source technologies in the R and Python programming languages and processed to create features; with the help of statistical analysis and advanced machine learning techniques, the prediction of HNSCC is obtained from the fine-tuned model. The optimal model was determined to be ResNet50, using the Sobel feature selection method for image data and ReliefF-based feature selection for clinical features, achieving a test accuracy of 97.9%. The model's precision score was 0.929, its recall score was 0.930, and its F1 score was 0.930. Additionally, the ResNet101 model demonstrated the best performance using the Histogram of Gradients feature selection method for image data and mutual-information-based feature selection for clinical features, yielding a test accuracy of 96.1%; its precision, recall, and F1 scores were identical to those of the aforementioned ResNet50 model. The research also utilized Kaplan-Meier survival analysis to investigate the survival rates of patients based on various factors, including age, gender, smoking status, tumor size, and tumor site. The results obtained from this analysis demonstrated the effectiveness of the method in providing valuable insights for risk assessment.

Item Supporting pairwise comparisons method by internet services (2023-06-28) Xue, Songwen
The pairwise comparisons method helps us with decision-making that involves multiple criteria. The consistency-driven pairwise comparisons method has been proven to be especially helpful with inconsistent data. By combining the pairwise comparisons method with today's advanced web technology, the decision-making process is simplified. This thesis integrates the theory of the pairwise comparisons method, Koczkodaj's inconsistency measurement, and reduction algorithms into an online implementation. The implementation utilizes JavaScript and its technologies, CSS, and HTML. It also uses version control technologies such as GitHub and GitHub Pages: GitHub is a platform and cloud-based service for software development and version control, and GitHub Pages provides online access to the deployed application.
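As an illustration of the inconsistency measurement mentioned above: for a single triad (a, b, c) taken from a pairwise comparisons matrix, consistency requires b = a·c, and Koczkodaj's indicator measures the smallest relative change needed to reach consistency. A minimal sketch in Python (the thesis implementation itself is in JavaScript):

```python
def koczkodaj_inconsistency(a: float, b: float, c: float) -> float:
    """Koczkodaj inconsistency of a triad (a, b, c) from a pairwise
    comparisons matrix; full consistency means b == a * c.
    Returns 0 for a consistent triad, approaching 1 as inconsistency grows."""
    return min(abs(1 - b / (a * c)), abs(1 - (a * c) / b))

print(koczkodaj_inconsistency(2, 6, 3))  # 0.0  (consistent: 6 == 2 * 3)
print(koczkodaj_inconsistency(2, 8, 3))  # 0.25 (inconsistent triad)
```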
Item RoBERTa: a machine reading comprehension for climate change question answering in natural language processing (2023-06-27) Mohasina, Mohasina
With the advancement of artificial intelligence technology in various domains over the past few years, question-answering systems have brought important changes to the knowledge acquisition process. Compared to a conventional retrieval question-answering system, a question-answering system that uses machine reading comprehension can provide short and accurate answers. This thesis proposes an intelligent question-answering approach based on information retrieval and machine reading comprehension. To begin, a two-stage information retrieval approach is developed. First, a dense vector technique (SRoBERTa) is utilized to produce a contextualized vector representation and roughly collect relevant climate change material. Second, an algorithm (DPR) is used to precisely rank and organize related paragraphs in order to obtain passages that are highly relevant to the question at hand. During the machine reading comprehension stage, the model is then improved using RoBERTa, which utilizes these texts to find concise and to-the-point answers. Compared to other common methods, the results of information retrieval and reading comprehension show that the models developed in this study perform well.

Item Super-resolution image reconstruction from multiple low-resolution images (2023-06-29) Vavadiya, Bhavinkumar
This study explores a novel approach for super-resolution image reconstruction from multiple low-resolution images, employing a frequency domain motion estimation technique (FMT), Keren-based image interpolation, and bicubic interpolation (BI). The method performs well in estimating scaling parameters, but accuracy decreases as the shift distance or rotation angle increases. Compared to Vandewalle's algorithm, the proposed method shows better accuracy in estimating scaling parameters but similar accuracy for rotation and translation parameters; differences are observed in the estimated values for each parameter between the two methods. The study underscores the need for further research to improve the accuracy of the proposed method in motion estimation and interpolation optimization. Additionally, Generative Adversarial Networks (GANs) outperform Bicubic and Wavelet Domain Super-Resolution (WDSR) algorithms in image quality improvement, as indicated by higher Peak Signal-to-Noise Ratio (PSNR) values. This superior performance is attributed to GANs' ability to leverage deep learning to capture complex image features. The research validates the potential of the proposed method for super-resolution image reconstruction, and the power of deep learning-based algorithms, specifically GANs, in enhancing low-resolution images. More advanced motion estimation algorithms and interpolation technique optimization could further improve the accuracy of this method.

Item Detecting image forgery over social media using U-NET with grasshopper optimization (2023-05-18) Ghannad, Niousha
Currently, video and digital images possess extensive utility, ranging from recreational and social media purposes to verification, military operations, legal proceedings, and penalization. The enhancement mechanisms of this medium have undergone significant advancements, rendering them more accessible and widely available to a larger population. Consequently, this has facilitated the ease with which counterfeiters can manipulate images. Convolutional neural network (CNN)-based feature extraction and detection techniques were used to carry out this task, which aims to identify the variations in image features between manipulated and non-manipulated areas. However, existing detection methods leave room for improvement. This thesis introduces a segmentation method that identifies the forgery region in images with an improved structure of the U-Net model. The suggested model connects the encoder and decoder pipeline by improving the convolution module and increasing the set of weights in the U-Net contraction and expansion paths. In addition, the parameters of the U-Net network are optimized using the grasshopper algorithm. Experiments were carried out on the publicly accessible image tampering detection evaluation dataset from the Chinese Academy of Sciences Institute of Automation (CASIA) to assess the efficacy of the suggested strategy. The quantitative results on CASIA, based on accuracy, precision, recall, and F1 score, show that the U-Net modifications significantly improve the overall segmentation results compared to other models.
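To make the encoder-decoder-with-skip-connections idea behind U-Net concrete, here is a minimal one-level PyTorch sketch; it is illustrative only and does not reproduce the thesis's modified convolution modules or grasshopper-tuned parameters:

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # The basic U-Net building block: two 3x3 convolutions with ReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """One-level U-Net: encode, downsample, decode, with a skip connection."""
    def __init__(self, in_ch=3, out_ch=1):
        super().__init__()
        self.enc = double_conv(in_ch, 32)
        self.down = nn.MaxPool2d(2)
        self.mid = double_conv(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = double_conv(64, 32)        # 64 = 32 (skip) + 32 (upsampled)
        self.head = nn.Conv2d(32, out_ch, 1)  # per-pixel forgery logit

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        u = self.up(m)
        return self.head(self.dec(torch.cat([e, u], dim=1)))

mask_logits = TinyUNet()(torch.randn(1, 3, 64, 64))  # -> (1, 1, 64, 64)
```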
Item Detecting fake accounts on Instagram using machine learning and hybrid optimization algorithms (2023-05-11) Azami, Pegah
In this thesis, we propose a hybrid method for detecting fake accounts on Instagram using the Binary Grey Wolf Optimization (BGWO) and Particle Swarm Optimization (PSO) algorithms. By combining these two algorithms, we aim to leverage their complementary strengths and enhance the overall optimization performance. We evaluate the proposed hybrid method using four classifiers: Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbor (KNN), and Artificial Neural Network (ANN). The dataset used for the experiments contains 65,329 Instagram accounts. We extract features from each account, including profile information, posting behavior, and engagement metrics, and perform feature selection using PSO. The results show that the hybrid optimization method (BGWOPSO) significantly outperformed both the Binary Grey Wolf Optimization (BGWO) and Particle Swarm Optimization (PSO) methods on several performance measures, including accuracy, precision, recall, and AUC, while selecting the best features.

Item Prediction of drug targets for pancreatic cancer using machine learning techniques (2023-05-16) Patel, Manasvi Girishbhai
Pancreatic cancer is one of the deadliest cancers, with a very low survival rate. However, people who are diagnosed early survive much longer than those who are not diagnosed through early screening; hence the importance of early diagnosis and, consequently, treatment of pancreatic cancer. As pancreatic cancer is rare, early screening for it is extremely costly. Research is ongoing into techniques that can detect and diagnose pancreatic cancer early through machine learning models, and even use them for the prediction of survival, immunotherapy response, risk of recurrence, etc. The successful application of this technology to predicting the presence of pancreatic cancer would be a breakthrough, as it would greatly increase the survival rate as well as the life expectancy of such patients. One of the major challenges in the treatment of pancreatic cancer is the lack of specific and effective drug targets. In recent years, advances in our understanding of the biology of pancreatic cancer have led to the identification of several potential drug targets, including oncogenic signaling pathways and cellular metabolism. Pancreatic cancer cells are highly metabolic, relying on glycolysis and the citric acid cycle to generate energy, and inhibiting these metabolic pathways has been shown to reduce the growth and survival of pancreatic cancer cells in preclinical studies. The Pan-Cancer dataset from Genomics of Drug Sensitivity in Cancer (GDSC) was used in this research to predict drug targets. Machine learning algorithms were used for feature importance (using Random Forest) and for the prediction of drug targets, using Bagging, Dense Neural Network, Naïve Bayes, Multilayer Perceptron, K-Nearest Neighbors, Support Vector Machines, Long Short-Term Memory, Recurrent Neural Network, and XGBoost classifiers.
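The Random Forest feature-importance step named above follows a standard scikit-learn pattern; a minimal sketch on synthetic data (the feature names are hypothetical placeholders, not the GDSC columns):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for GDSC-derived features.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 4)),
                 columns=["expr_geneA", "expr_geneB",
                          "mutation_load", "pathway_score"])
y = (X["expr_geneA"] + 0.5 * X["pathway_score"] > 0).astype(int)  # synthetic target

# Fit a Random Forest and rank features by impurity-based importance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, imp in sorted(zip(X.columns, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```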
Item Development of a novel, multidisciplinary, computer-based disaster simulation training tool for code orange event in northern Ontario (2024-09-16) Muller-Hartle, Aarden
Serious games utilize the virtual medium of video games to increase player knowledge. Within this thesis, we created a serious game for training hospital staff for emergency (Code Orange) events. The process of creating this game, Safety Simulator, is described in detail, as we present various learning theories and models, taking care to link how these were used to build our serious game. Through analysis of 5-point Likert-scale and supplementary data, it was determined that participants have a positive feeling overall towards Safety Simulator, and that elements such as choosing to play the tutorial can impact a participant's ability to feel positively towards the learning elements of the game. We found that the training factor had a calculated chi value of 0.03 for tutorial elements and participants' opinions on learning, indicating a connection between these elements.

Item The problematic internet use in Pakistan (2021-06-25) Hassan, Hammad
The revolution in smart devices and smart objects has dramatically improved the usability of the Internet around the world, and over the last few decades we have seen a highly dynamic Internet trend. The increasing use of the Internet today has played a very significant role in compulsive Internet use, which can affect the educational, psychological, medical, and social well-being of the user. Internet restrictions in developing countries like Pakistan are becoming more severe, and the public is not fully aware of problematic Internet usage. The current COVID-19 situation and subsequent lockdowns have further raised the level of compulsive Internet use in developing countries like Pakistan. This treatise explores existing literature, social dilemmas, and problematic Internet use in Pakistan. We look forward to formally analyzing the literature and conducting pilot studies to make further contributions to this issue.

Item Outcome-based judgement categorization of the Supreme Court of Canada (2022-09-12) Malley, Thomas
Outcome-based judgement categorization of the Supreme Court of Canada (SCC) falls within the multidisciplinary field of computational law. In the court hierarchy, the SCC is the highest court in Canada, and decisions from this court generally bind any lower court. Since court decisions are in a textual format, it is possible to correctly categorize outcomes of the SCC utilizing Natural Language Processing (NLP) techniques. The experiment contained herein shows algorithmic categorization performance with an F1 score greater than 60. This result is significant given the binary nature of case outcomes (allow, dismiss), which an individual unfamiliar with the law should be able to guess correctly 50% of the time. This work is a preliminary study pointing toward future work on outcome forecasting in the judicial branch of government.
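To give a sense of what such NLP categorization involves, a baseline binary outcome classifier with an F1 evaluation can be sketched as follows; the miniature decision snippets and labels are invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.metrics import f1_score

# Invented miniature corpus of decision texts with allow/dismiss outcomes.
texts = ["appeal allowed costs awarded", "appeal dismissed with costs",
         "the appeal is allowed", "the appeal is dismissed"] * 10
labels = [1, 0, 1, 0] * 10  # 1 = allow, 0 = dismiss

pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
pipe.fit(texts[:32], labels[:32])
print("F1:", f1_score(labels[32:], pipe.predict(texts[32:])))
```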
Item Evaluation of U-Net model in the detection of cervical spine fractures (2023-10-18) Kheirandish, Faranak
The cervical spine is composed of seven vertebrae, C1 to C7, with a lordotic (C-shaped) curve and joints between the vertebrae for spine mobility. Computed tomography (CT) is commonly used by experts and physicians in imaging diagnosis to give information about the cervical spine and vertebrae in the neck. Diseases such as spinal stenosis (narrowing of the spinal canal), herniated discs, tumors, and fractures in the cervical spine can be diagnosed by CT scans. Quickly detecting the presence and location of cervical spine fractures in CT scans helps physicians prevent neurologic deterioration and paralysis after trauma. Throughout this thesis, a U-Net model was trained for semantic segmentation on approximately 2,019 study instances with provided CT images, of which only 87 had been segmented by spine radiology specialists. After that, a combination of 2D CNN and bidirectional GRU deep learning models was used for the detection of fractures in each vertebra, as a classification task. The objectives of this research are to develop two deep-learning models for detecting and localizing cervical spine fractures and to evaluate the ongoing research activities on semantic segmentation and classification in the medical field. This research uses a semantic segmentation algorithm based on the U-Net architecture to estimate the location of each cervical vertebra, and proposes a deep convolutional neural network (DCNN) with a bidirectional GRU (Bi-GRU) layer for the automated detection of cervical spine fractures in CT images. This approach was trained and tested on a dataset provided by RSNA (with a team of the American Society of Neuroradiology and Spine Radiology). Furthermore, the critical factors that must be taken into consideration when segmenting 3D medical images, such as preprocessing techniques and specialized loss functions, were explored. Whether used as a standalone framework for segmentation and classification tasks or as an integrated backbone for medical image processing, this architecture is flexible enough to accommodate other models. The proposed approach yields results comparable to those of existing techniques, but it can be improved by using larger image sizes and more advanced GPU workstations that would reduce the overall processing time. Future research will use other pretrained networks as encoders and increase image sizes to examine the performance improvement of the architecture, which requires more advanced computational resources, and will also integrate the current architecture into simulated crash scenarios for use in various applications, such as designing protective sports equipment.
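The classification stage described above — per-slice CNN features aggregated by a bidirectional GRU — can be sketched in PyTorch roughly as follows; the dimensions and the tiny backbone are assumptions for illustration, not the thesis's exact architecture:

```python
import torch
import torch.nn as nn

class SliceSequenceClassifier(nn.Module):
    """CNN features per CT slice -> Bi-GRU over the slice sequence -> fracture logit."""
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        # Stand-in CNN; in practice a deeper 2D backbone would be used.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, feat_dim),
        )
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x):                      # x: (batch, slices, 1, H, W)
        b, s = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, s, -1)
        out, _ = self.gru(feats)               # (b, s, 2 * hidden)
        return self.head(out.mean(dim=1))      # pooled sequence -> (b, 1)

logit = SliceSequenceClassifier()(torch.randn(2, 10, 1, 64, 64))
```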
Item Emotion-centric image captioning using a self-critical mean teacher learning approach (2022-11-07) Yousefi, Aryan
Image captioning is the multi-modal task of automatically generating natural language descriptions of a visual input using various deep learning techniques. This research area lies at the intersection of the Computer Vision and Natural Language Processing fields, and it has gained increasing popularity over the past few years. Image captioning is an important part of scene understanding, with extensive applications such as helping visually impaired people, recommendations in editing applications, and usage in virtual assistants. However, most of the previous work on this topic has focused on purely objective, content-based descriptions of the image scenes. The goal of this thesis is to generate more engaging captions by leveraging human-like emotional responses in the captioning process. To achieve this, a Mean Teacher Learning-based method has been applied to the recently introduced ArtEmis dataset. This method includes a self-distillation relationship between memory-augmented language models with meshed connectivity, which are first trained in a cross-entropy-based phase and then fine-tuned in a Self-Critical Sequence Training phase. In addition, we propose a novel classification module that decreases texture bias and encourages the model towards shape-based classification. We also propose a method to utilize extra emotional supervision signals in the caption generation process, leveraging the image-to-emotion classifier. Compared with the state-of-the-art results on the ArtEmis dataset, our proposed model outperforms the current benchmark significantly on multiple popular evaluation metrics, such as BLEU, METEOR, ROUGE-L, and CIDEr.

Item Biocybernetic closed-loop system to improve engagement in video games using electroencephalography (2022-01-06) Klaassen, Stefan
The purpose of this thesis was to determine the level of engagement with specific stimuli while playing video games. The modern video game industry has a large and wide audience and is therefore becoming more popular and accessible to the public. The interactions and rewards offered in video games are key to keeping player engagement high. Understanding the player's brain and how it reacts to different types of stimuli would help to continue improving games and advance the industry into a new era. Although the study of human engagement started many years ago, its application to measuring video game players is more recent and still an evolving field of research. This thesis takes an objective approach by measuring engagement through electroencephalogram (EEG) readings and examining whether it can help improve current dynamic difficulty adjustment (DDA) systems for video games, leading to more engaging and entertaining games. Although statistically significant findings were not obtained in this experiment, techniques for future experiments were laid out in the form of classifier comparisons and program layouts.
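One commonly used EEG engagement index (due to Pope et al.) is the ratio of beta band power to the sum of alpha and theta band power; the abstract does not specify its exact metric, so the following SciPy sketch is a generic illustration, assuming a single channel at a hypothetical 256 Hz sampling rate:

```python
import numpy as np
from scipy.signal import welch

def band_power(freqs, psd, lo, hi):
    # Integrate the power spectral density over one frequency band.
    mask = (freqs >= lo) & (freqs < hi)
    return np.trapz(psd[mask], freqs[mask])

def engagement_index(signal, fs=256):
    """Engagement index beta / (alpha + theta) from one EEG channel."""
    freqs, psd = welch(signal, fs=fs, nperseg=fs * 2)
    theta = band_power(freqs, psd, 4, 8)
    alpha = band_power(freqs, psd, 8, 13)
    beta = band_power(freqs, psd, 13, 30)
    return beta / (alpha + theta)

# Ten seconds of synthetic "EEG" noise, for demonstration only.
print(engagement_index(np.random.default_rng(0).normal(size=256 * 10)))
```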
Item Crop disease detection using deep learning techniques on images (2023-05-23) Deputy, Kinjal Vijaybhai
Agriculture is often referred to as the main sector for the development of the economy in various countries, and it provides food to the large population of the world despite various limitations and boundaries. Food security is threatened by several factors, including climate change, the decline in pollinators, and plant diseases, among others. Different efforts have been made to prevent crop loss due to infections in plants, and advancements in technology are helping farmers develop systems that can reduce the problem. Smartphones in particular offer novel ways to identify diseases because of their computing power, high-resolution displays, and extensive built-in sets of accessories, such as advanced HD cameras. This leads to a situation where disease diagnosis based on automated image recognition is needed, and image recognition is made possible by applying a deep learning approach. This research therefore aims to analyze deep learning-based image detection techniques to identify various diseases in plants. The "PlantVillage" dataset was used to train the models, with deep learning architectures such as AlexNet, GoogleNet, ResNet50, and InceptionV3. Two approaches were used to train the models: training from scratch and transfer learning. The results of the primary analysis showed that GoogleNet outperforms AlexNet, ResNet50, and InceptionV3 in the training-from-scratch approach, while ResNet50 performed best with transfer learning.

Item Optimal data allocation method considering privacy enhancement using E-CARGO (2023-04-19) Peng, Chengyu
With the rise in popularity of cloud computing, there is a growing trend toward the storage of data in a cloud environment. However, this brings a significant increase in the risk of privacy information leakage, and users can face serious challenges as a result of data leakage. In this paper, we propose an allocation scheme for the storage of data in a collaborative edge-cloud environment, with a focus on enhanced data privacy. In addition, we explore an extended application of the approach to sourcing. Specifically, we first evaluate the datasets and servers. We then introduce several constraints and use the Environments-Classes, Agents, Roles, Groups, and Objects (E-CARGO) model to formalize the problem. Based on the qualification values, we find the optimal allocation using the IBM ILOG CPLEX Optimization (CPLEX) package. At a given scale, the allocation scheme scores based on our method improve by about 50% compared to the baseline method and the trust-based method. Moreover, we use a similar approach to analyze procurement issues in the supply chain to help companies reduce carbon emissions. This shows that our proposed solution can store data in servers that better suit the data's requirements and is adaptable to other problems.
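At a simplified level, this kind of allocation is an assignment problem over a qualification matrix; as a small stand-in for the CPLEX-based formulation, SciPy's Hungarian-algorithm solver can compute an optimal one-to-one allocation (the 3×3 qualification values below are invented):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical qualification matrix Q[i][j]: how well server j suits
# dataset i (higher is better), combining privacy and trust scores.
Q = np.array([[0.9, 0.4, 0.6],
              [0.3, 0.8, 0.5],
              [0.7, 0.2, 0.9]])

# linear_sum_assignment minimizes cost, so negate Q to maximize qualification.
rows, cols = linear_sum_assignment(-Q)
for i, j in zip(rows, cols):
    print(f"dataset {i} -> server {j} (qualification {Q[i, j]:.1f})")
print("total qualification:", Q[rows, cols].sum())
```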
Item Detecting spam in emails using advanced machine learning methods (2022-06-03) Mistry, Nirali
E-mail is one of the quickest and most professional ways to send messages from one location to another around the world; however, increased use of e-mail has increased the number of messages received in the mailbox, some of which cause significant and varied problems, such as theft of the recipient's identity, loss of vital information, and network damage. These communications are so harmful that the user has no way of avoiding them, especially when they come in a variety of forms, such as advertisements and other types of messages. Spam is the term for these emails, and filtering is used to delete these spam communications and prevent them from being viewed. This research intends to improve e-mail spam filtering by proposing a single-objective evaluation algorithm that uses deep learning, genetic algorithms, and Bayes-theorem-based classifiers to build the optimal model for accurately categorizing e-mail messages. Text cleaning and feature selection are used as the initial stages in the modeling process to minimize the dimension of the sparse text features obtained from spam and ham communications. Feature selection is used to choose the best features, and the final stage is to identify spam using Genetic Algorithm, Support Vector Machine (SVM), Bayesian, Naïve Bayes, Random Forest, and Long Short-Term Memory classifiers.
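A minimal version of the Bayes-theorem-based filtering stage — TF-IDF features feeding a Naïve Bayes classifier — might look like the following in scikit-learn; the four example messages are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy messages; a real study would use a labeled spam/ham corpus.
messages = ["win a free prize now", "meeting agenda attached",
            "cheap loans click here", "lunch tomorrow at noon"]
labels = ["spam", "ham", "spam", "ham"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(messages, labels)
print(model.predict(["free prize click now"]))  # expected: ['spam']
```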
Item BERT-based multi-task learning for aspect-based sentiment analysis (2022-01-20) Bhagat, Yesha
Aspect-Based Sentiment Analysis (ABSA) systems aim to extract aspect terms (e.g., pizza, staff member), opinion terms (e.g., good, delicious), and their polarities (e.g., positive, negative, and neutral), which can help customers and companies identify product weaknesses. By addressing these product weaknesses, companies can enhance customer satisfaction, increase sales, and boost revenues. There are several approaches to performing ABSA tasks, such as classification, clustering, and association rule mining; in this research, we have used a neural network-based classification approach. The most prominent neural network-based methods for ABSA tasks include BERT-based approaches, such as BERT-PT and BAT. These approaches build separate models for each ABSA subtask, such as aspect term extraction (e.g., pizza, staff member) and aspect sentiment classification, and they use different training algorithms, such as post-training and adversarial training. Moreover, they do not consider the subtask of opinion term extraction. This thesis proposes a new system for ABSA, called BERT-ABSA, which uses a Multi-Task Learning (MTL) approach and differs from these previous approaches by solving all three tasks (aspect term extraction, opinion term extraction, and aspect-term-related sentiment detection) simultaneously, taking advantage of similarities between the tasks to enhance the model's accuracy and reduce training time. To evaluate our model's performance, we used the SemEval-14 Task 4 restaurant datasets. Our model outperforms previous models in several aspect-based opinion mining (ABOM) tasks, and the experimental results support its validity.

Item Pairwise comparisons and visual perceptions of 3D shape volume estimation (2022-05-31) Wan, Wenjun
Using pairwise comparisons for estimation increases accuracy. At present, scholars use the pairwise comparisons method to make subjective comparisons between one-dimensional and two-dimensional images; this research is about the subjective comparison of three-dimensional objects. We first set a fixed object volume, then use a random method to generate multiple three-dimensional objects with different shapes and scale them to our designed volume values. This study also virtualizes and binarizes the images and prints the actual objects by way of 3D printing for respondents to observe. Thirty-two respondents used the direct and pairwise comparisons methods to rate the volume of five randomly generated 3D shapes. It was found that with the direct method, the observers' estimation errors are higher on average than when the pairwise comparisons method is used. The pairwise comparisons method can improve the accuracy of estimating the volume of random objects.
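To turn a set of pairwise volume judgements into relative volume estimates, one standard technique is to take the normalized geometric mean of each row of the comparisons matrix; a small NumPy sketch (the 3×3 judgement matrix is invented and, for clarity, fully consistent):

```python
import numpy as np

# Invented pairwise judgements: M[i][j] = how many times bigger shape i
# looked than shape j (reciprocal matrix, so M[j][i] = 1 / M[i][j]).
M = np.array([[1.0, 2.0, 4.0],
              [0.5, 1.0, 2.0],
              [0.25, 0.5, 1.0]])

# Geometric mean of each row, normalized to sum to 1, gives the
# relative volume weights implied by the comparisons.
gm = np.prod(M, axis=1) ** (1 / M.shape[1])
weights = gm / gm.sum()
print(weights)  # ~[0.571, 0.286, 0.143] for this consistent matrix
```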
Item Social media hate speech detection using explainable AI (2022-05-25) Mehta, Harshkumar
Artificial Intelligence has entered many fields in recent times; be it science, education, finance, business, or social media, Artificial Intelligence has found applications everywhere. Currently, however, AI is largely limited to its subset, machine learning, and has not yet realized its full potential. In machine learning, in contrast to traditional programming, which requires writing algorithms, an algorithm learns patterns between input and output from a given dataset and builds a predictive model. A key impediment of current AI-based systems, though, is that they often lack transparency: they have adopted a black-box nature that allows powerful predictions which cannot be explained directly. To gain human trust and increase the transparency of AI-based systems, many researchers think that Explainable AI is the way forward. In today's era, an enormous part of human communication takes place over digital platforms, for example through social media platforms, and so does hate speech, which is dangerous for individuals as well as society. These days, automated hate speech detection is built into social media platforms such as Twitter and Facebook using machine learning approaches. Deep learning models attain high performance but have low transparency due to their complexity, which leads to a trade-off between performance and explainability. Explainable Artificial Intelligence (XAI) is used to make black-box approaches interpretable without giving up performance; XAI methods provide explanations that can be understood by humans without deep knowledge of deep learning models. XAI's flexible and multifaceted characteristics give it strong potential in hate speech detection with deep learning models, providing a strong interconnection between an individual moderator and the hate speech detection framework, which is a pivot for research in interactive machine learning. In the case of Twitter, main tweets are checked for hate speech; however, retweets and replies are not, as there is no tool to detect hate speech in ongoing conversations. The aim of this research is to interpret and explain decisions made by complex AI models, in order to understand the decision-making process of these models. While machine learning models are being developed to detect hate speech on social media, these models lack interpretability and transparency in the decisions they make; traditional machine learning models achieve high performance at the cost of interpretability. The main objectives of this research are to review and compare various techniques used in Explainable Artificial Intelligence (XAI), to present a novel approach for hate speech classification using XAI, and to achieve a good trade-off between precision and recall for the proposed method. Explainable AI models for hate speech detection will help social media moderators and other users of these models not only to see, but also to study and understand, how decisions are made and how inputs are mapped to outputs. As part of this research study, two datasets were used to demonstrate hate speech detection using Explainable Artificial Intelligence (XAI). Data preprocessing was performed to remove bias, clean the data of inconsistencies, clean the text of the tweets, and tokenize and lemmatize the text; categorical variables were also simplified in order to generate a clean dataset for training purposes. Exploratory data analysis was performed on the datasets to uncover patterns and insights. Various pre-existing models were applied to the Google Jigsaw dataset, such as Decision Trees, K-Nearest Neighbours, Multinomial Naïve Bayes, Random Forest, Logistic Regression, and Long Short-Term Memory (LSTM), of which LSTM achieved an accuracy of 97.6%, an improvement over the studies of Risch et al. (2020). An explainable method, LIME (Local Interpretable Model-Agnostic Explanations), was applied to the HateXplain dataset. Variants of the BERT (Bidirectional Encoder Representations from Transformers) model, BERT + ANN (Artificial Neural Network) and BERT + MLP (Multilayer Perceptron), were created to achieve good performance in terms of explainability using the ERASER (Evaluating Rationales and Simple English Reasoning) benchmark by DeYoung et al. (2019), wherein BERT + ANN achieved better performance in terms of explainability compared to the study by Mathew et al. (2020).
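Applying LIME to a text classifier follows the pattern sketched below, using the lime package with a tiny scikit-learn pipeline standing in for the thesis's LSTM and BERT models; the two training examples are invented:

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny stand-in classifier; the thesis applies LIME to models trained
# on the HateXplain dataset instead.
pipe = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipe.fit(["you are wonderful people", "I hate those people"], [0, 1])

explainer = LimeTextExplainer(class_names=["normal", "hate"])
exp = explainer.explain_instance("I hate everything about those people",
                                 pipe.predict_proba, num_features=4)
print(exp.as_list())  # per-word weights for/against the 'hate' class
```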
Item Analyzing impact on bitcoin prices through Twitter social media sentiments (2022-04-28) Patel, Jay
Many cryptocurrencies exist today, and many more are on the verge of being brought into circulation. A cryptocurrency is a form of digital currency, but instead of being run by a centralized authority or government, it is a decentralized structure created using blockchain technology. These currencies are highly influential and unpredictable, with factors of influence ranging high and low all over the world. This research revolves around the most renowned cryptocurrency, Bitcoin, and focuses on the relationship of Bitcoin with the prominent online media platform Twitter. Twitter takes part in the discussion of almost all major and related incidents and events around the world; it is a social media platform so informative and useful to the public that even major personalities and politicians take to it to express their views quickly on important matters. The research first gathered the tweets, divided into two parts (verified and non-verified users), and then applied a cleaning process to the data to make sure that only the desired and necessary data was left for further research. The tweets regarding Bitcoin were analyzed and utilized for deeper observation, so that sentiment could be extracted and visualized against Bitcoin prices to derive a conclusion regarding the relationship between Twitter and Bitcoin prices. The analysis returned many insights and inferences relating to the influence that Bitcoin prices and related tweets have on each other. The results of the report present the outcome of the analysis, stating whether the original hypothesis held true.
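The heart of such an analysis — aligning daily tweet sentiment with daily prices and measuring co-movement — can be sketched as a simple correlation computation; the sentiment scores and prices below are invented placeholders, not results from the thesis:

```python
import numpy as np

# Invented daily series: mean tweet sentiment in [-1, 1] and BTC close price.
sentiment = np.array([0.10, 0.35, -0.20, 0.50, 0.05, -0.40, 0.25])
prices = np.array([42000, 43500, 41800, 44200, 42900, 40700, 43100])

# Pearson correlation between same-day sentiment and price.
same_day = np.corrcoef(sentiment, prices)[0, 1]

# Lagged correlation: does today's sentiment track tomorrow's price?
lagged = np.corrcoef(sentiment[:-1], prices[1:])[0, 1]

print(f"same-day correlation: {same_day:.2f}, one-day lag: {lagged:.2f}")
```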