IEEE Latest 2018-19 Data Mining, ML, Big Data, AI Java Projects

For: B.E | B.Tech | M.E | M.Tech | MCA | BCA | Diploma | MS | M.Sc
IEEE
REAL TIME PROJECTS & TRAINING GUIDE
SOFTWARE & EMBEDDED
www.makefinalyearproject.com
PROJECTS TITLES FOR ACADEMIC YEAR 2018-2019

#19, MN Complex, 2nd Cross, Sampige Main Road, Malleswaram, Bangalore – 560003
Call Us: 9590544567 / 7019280372
www.makefinalyearproject.com | www.igeekstechnologies.com
Landmark: Opposite Joyalukkas Gold Showroom, near Mantri Mall
IGTJP001 Serendipitous Recommendation in E-Commerce Using Innovator-Based
Collaborative Filtering (IEEE-2018)
The rapid development of information technology has fostered a convenient
trading environment on the Internet. Many trading platforms exist today, but
none is designed for direct consumer-to-consumer (C2C) trading aimed primarily
at university students who want to buy and sell goods and services directly
with other students in their university or city. This need arises in social
networks where items should be easy to trade or exchange within a small
community. Well-known websites such as Amazon and eBay are too global in
nature and do not support direct trading of goods and services among students
in a small social network such as a campus.
IGTJP002 Machine Learning and Deep Learning Method For Cyber Security
(IEEE-2018)
Cyber security is the branch of computer technology that deals with security
in cyberspace. Cyberspace refers to the description of policies regarding
networks and computer systems. The policies laid out in cyber security aim to
prevent malicious activity and unauthorized access to secured information.
Since the emergence of highly structured networks, there has been concern
about how intelligently these networks are secured. "Cyber" refers to
something that can be done on the internet; "crime" refers to something done
illegally or without authorization. Crimes committed on the internet in order
to gain access to secured information or authorization rights are termed
cybercrime.
IGTJP003 Early Prediction of Chronic Kidney Disease Using Machine Learning
Supported by Predictive Analytics (IEEE-2018)
Chronic Kidney Disease (CKD) is a worldwide public health problem because of
the costly treatment of its end stage and the high likelihood of death. The
World Health Organization (WHO) has reported that, in a 2012 comparison of six
regions, South East Asia and the Americas had the highest annual rate (around
1.4%) of population with this disease. In Thailand, approximately 17.5% of the
adult population is identified as having CKD. Furthermore, the number of new
patients increases yearly, while public health insurance has limitations in
areas such as free or low-cost prescriptions, availability of the necessary
medical equipment, and the medical reimbursement limit. Because dialysis costs
about 1,500 baht per session and 4,500 baht per week, patients have to cover
any expense above the medical reimbursement limit.
IGTJP004 Simplistic Approach to Detect Cybercrimes and Deter Cyber
Criminals (IEEE-2018)
Criminal-minded users' informal conversations on social media (e.g. Twitter)
shed light on their educational experiences: opinions, feelings, and concerns
about the learning process. Data from such un-instrumented environments can
provide valuable knowledge to inform student learning. Analyzing such data,
however, can be challenging. The complexity of criminal-minded users'
experiences reflected in social media content requires human interpretation,
yet the growing scale of data demands automatic analysis techniques. In this
project a data mining algorithm based on a Naïve Bayes multi-label classifier
is implemented in several steps: collecting data from Twitter; cleaning the
data by removing stop words, non-letter characters, and punctuation marks; and
estimating the probability of the words for the categories Heavy Study Load,
Sleep Problems, Lack of Social Engagement, Negative Emotion, and Diversity
Issues. Accuracy, precision, recall, F1 measure, and micro-averaged and
macro-averaged values are computed over all tweets, for each category and for
each user. From these we can conclude, on average, how many criminal-minded
users have each category of problem, and extend the analysis to the problems
faced by each individual user.
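The steps described for IGTJP004 (cleaning, per-category word probabilities, multi-label prediction) can be sketched roughly as below. This is a minimal illustration and not the project's implementation: the stop-word list, the helper names, and the use of Laplace smoothing with one binary Naïve Bayes model per category are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "i", "so", "my", "at", "no"}

def clean(text):
    """Lower-case the text, keep runs of letters only, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

class BinaryNaiveBayes:
    """Multinomial Naive Bayes for one category (label present vs. absent)."""

    def fit(self, docs, labels):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        # Laplace (add-one) smoothing on the prior and on every word probability.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for w in tokens:
            count = self.word_counts[label][w]
            score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict(self, tokens):
        return self.log_score(tokens, True) > self.log_score(tokens, False)

def train_multilabel(tweets, categories):
    """tweets is a list of (text, set_of_categories); one model per category."""
    docs = [clean(text) for text, _ in tweets]
    return {
        c: BinaryNaiveBayes().fit(docs, [c in labels for _, labels in tweets])
        for c in categories
    }

def predict_multilabel(models, text):
    """Return the set of categories whose model votes 'present'."""
    tokens = clean(text)
    return {c for c, model in models.items() if model.predict(tokens)}
```

Training one binary model per category is only one way to obtain multi-label output; per-category precision, recall, and F1 would then be computed by comparing these predicted sets with the labelled ones.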
IGTJP005 Review Spam Detection using Machine Learning (IEEE-2018)
Before buying a product, people usually inform themselves by reading online
reviews. To make more profit, sellers often try to fake user experience.
Because customers are deceived this way, recognizing and removing fake reviews
is of great importance. This paper analyzes spam detection methods based on
machine learning and presents an overview of them and their results.
IGTJP006 A Subword-based Deep Learning Approach for Sentiment Analysis of
Political Tweets (IEEE-2018)
Online Social Networks (OSNs) are a growing phenomenon with a global impact
among web users. Social sites aim to let users create, share, and express
information in real time, which is useful to analyze in terms of data
streaming and opinion diffusion. Social platforms such as Twitter have become
popular in recent years because users can post short messages, also referred
to as tweets, with a maximum length of 140 characters, and can interact with
celebrities, politicians, opinion leaders, and other users by following their
accounts. The amount of data generated by Twitter is proportional to its
expanding community: Twitter Company Facts reported that as of January 30,
2016 the number of active users had increased to 313 million, so there is
ample scope for gathering data on different topics and opinions related to
political affairs. Political activities span a vast range of categories. When
a social event has an impact on society, opinion about it flows through
Twitter as a growing trend, with large portions of global users tweeting or
sharing the targeted opinion, which for the most part is expressed as a
number of hashtags.
IGTJP007 An Internal Intrusion Detection and Protection System by Using Data
Mining and Forensic Techniques (IEEE-2018)
Currently, most computer systems use user IDs and passwords as the login
patterns to authenticate users. However, many people share their login
patterns with coworkers and ask these coworkers to assist with shared tasks,
making the login pattern one of the weakest points of computer security.
Insider attackers, the valid users of a system who attack the system
internally, are hard to detect because most intrusion detection systems and
firewalls identify and isolate malicious behavior launched from outside the
system. Hence an audit-based framework over valid users is required to
identify suspicious behavior by other users who are using a valid user's
credentials.
IGTJP008 Corporate Communication Network and Stock Price Movements: Insights
From Data Mining (IEEE-2018)
This project uses a data-mining algorithm to detect communication patterns in
company social media and determine whether they have any effect on stock
market prices. Specifically, we want to find out whether any association
relationships exist between the frequency of the company's tweets and the
performance of the company as reflected in its stock prices. If such
relationships do exist, we would also like to know whether the company's stock
price could be accurately predicted from them. To detect the association
relationships, a data-mining algorithm is proposed to mine e-mail
communication records and historical stock prices so that, based on the
detected relationships, rules that predict changes in stock prices can be
constructed.
IGTJP009 Toward Truly Personal Chatbots -
ECOMMERCE_CHAT_BOT_APPLICATION (IEEE-2018)
The internet is expanding and many buyers are shifting from direct buying to
online purchases. Two models are evolving in the ecommerce revolution:
Business to Consumer (B2C) and Business to Business (B2B). In B2C systems the
consumer performs transactions directly on the merchant site and the merchant
manages the entire end-to-end delivery of the product. In B2B systems the
merchant has a tie-up with a logistics company, which is responsible for
delivering the product once it is sold online. The number of online products
and ecommerce websites is increasing exponentially, so ecommerce applications
need a recommendation system. In this project different categories of books
are provided, and the user can purchase a book from any category using an
account number and IPIN. The application also tracks the pages the user visits
and the actions the user performs. The project provides filtered
recommendations to the end user based on the category, publisher, and price
range the user needs, gathered through a set of questions based on artificial
intelligence. The user chats with an artificial intelligence engine that
measures the relationship between the query and the attributes of each book
based on semantic relations and then recommends products to the end user.
IGTJP010 Web Media and Stock Markets: A Survey and Future Directions from a Big
Data Perspective (IEEE-2018)
Social media texts contain huge commercial value: emotional color analysis of
them can help predict movie box office, monitor public opinion, and understand
the user experience. For stock price prediction, the comments related to
stocks and financials are necessary and sufficient. To fetch this emotional
data, a topical web crawler is developed and used to fetch stock comments
after the close of each trading day. In the proposed approach the stock data
is obtained from social media. Multiple features are considered, including
stock-market-related words; sentiment analysis is then performed on the social
media tweets, and prediction is performed based on predefined data sets of
previous history and the social media index.
IGTJP011 DeepMovRS: A unified framework for deep learning-based movie
recommender systems (IEEE-2018)
The world wide web can be viewed as a repository of opinions from users
spread across various websites and networks, and today's citizens look up
reviews and opinions to judge movies and visit forums to debate films and
acting. With this explosion in the volume of and reliance on user reviews and
opinions, directors and producers face the challenge of automating the
analysis of such large amounts of data (user reviews, opinions, sentiments).
Armed with these results, directors can enhance their movies and tailor the
experience for the customer. Similarly, policy makers can analyze these posts
to get instant and comprehensive feedback. This project is the outcome of
our research in gathering opinion and review data from popular portals,
movie websites, forums, or social networks, and processing the data using the
rules of natural language and grammar to find out what exactly was being
talked about in the user's review and the sentiments that people are
expressing. Our approach scans every line of data and generates a cogent
summary of every review (categorized by aspects) along with various graphical
visualizations. A novel application of this approach is helping movie
directors gauge audience response.
IGTJP012 Emotion Recognition on Twitter: Comparative Study and Training a Unison
Model (IEEE-2018)
Despite recent successes of deep learning in many fields of natural language
processing, previous studies of emotion recognition on Twitter
mainly focused on the use of lexicons and simple classifiers on bag-of-words
models. The central question of our study is whether we can improve their
performance using deep learning. To this end, we exploit hash tags to create
three large emotion-labeled data sets corresponding to different
classifications of emotions. We then compare the performance of several
word and character-based recurrent and convolutional neural networks with
the performance on bag-of-words and latent semantic indexing models. We
also investigate the transferability of the final hidden state representations
between different classifications of emotions, and whether it is possible to
build a unison model for predicting all of them using a shared representation.
We show that recurrent neural networks, especially character-based ones,
can improve over bag-of-words and latent semantic indexing models.
Although the transfer capabilities of these models are poor, the newly
proposed training heuristic produces a unison model with performance
comparable to that of the three single models.
IGTJP013 Integrating StockTwits with Sentiment Analysis for better Prediction of
Stock Price Movement (IEEE-2018)
In this project several data mining techniques are implemented and analyzed:
sequential analysis, a sequential index, real-time stock prediction, and
ranking of companies based on the stock market. The complete sequence offers
an innovative approach that helps the user rank companies in order of their
predicted stock market increase.
IGTJP014 Emotion Recognition on Twitter: Comparative Study and Training a
Unison Model (IEEE-2018)
Despite recent successes of deep learning in many fields of natural language
processing, previous studies of emotion recognition on Twitter mainly
focused on the use of lexicons and simple classifiers on bag-of-words
models. The central question of our study is whether we can improve their
performance using deep learning. To this end, we exploit hash tags to create
three large emotion-labeled data sets corresponding to different
classifications of emotions. We then compare the performance of several
word and character-based recurrent and convolutional neural networks
with the performance on bag-of-words and latent semantic indexing
models. We also investigate the transferability of the final hidden state
representations between different classifications of emotions, and whether
it is possible to build a unison model for predicting all of them using a shared
representation. We show that recurrent neural networks, especially
character-based ones, can improve over bag-of-words and latent semantic
indexing models. Although the transfer capabilities of these models are
poor, the newly proposed training heuristic produces a unison model with
performance comparable to that of the three single models.
IGTJP015 Automatically Identifying Themes and Trends in Software Engineering
Research (IEEE-2018)
This project first determines the current research themes within the Software
Engineering field by identifying the key topics that academics are researching
and publishing today. Second, it investigates the application of Natural
Language Processing (NLP) techniques to extract those themes automatically by
parsing the papers published thus far in 2015, notably applying a clustering
technique to identify the key collections of papers.
IGTJP016 Accelerating Test Automation through a Domain Specific Language
(IEEE-2017)
The proposed approach uses the Accelerating Test Automation Platform (ATAP),
which aims to make test automation accessible to non-programmers. ATAP allows
an automation test script to be created in a domain specific language based
on English. The English-like test scripts are automatically converted to
machine-executable code using Selenium WebDriver. Because the scripts are
English-like, they are easy for non-programmers to author. The functional flow
of an ATAP script is also easy to understand, which makes maintenance simpler.
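To illustrate how an English-like test script might be turned into machine-readable steps, here is a small sketch. The three sentence patterns and the `parse_script` helper are hypothetical and are not ATAP's actual grammar; the sketch also stops at producing action tuples, which a separate layer would then map onto Selenium WebDriver calls.

```python
import re

# Hypothetical grammar: these three sentence shapes are illustrative only.
PATTERNS = [
    (re.compile(r'^open "(?P<url>[^"]+)"$', re.IGNORECASE), "open"),
    (re.compile(r'^type "(?P<text>[^"]*)" into "(?P<target>[^"]+)"$', re.IGNORECASE), "type"),
    (re.compile(r'^click "(?P<target>[^"]+)"$', re.IGNORECASE), "click"),
]

def parse_script(script):
    """Translate an English-like script into a list of (action, arguments) steps."""
    steps = []
    for line_no, raw in enumerate(script.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        for pattern, action in PATTERNS:
            match = pattern.match(line)
            if match:
                steps.append((action, match.groupdict()))
                break
        else:
            # No pattern matched: report the offending line to the script author.
            raise ValueError(f"line {line_no}: cannot parse {line!r}")
    return steps

script = '''
Open "https://example.com/login"
Type "alice" into "username"
Click "Sign in"
'''
for action, args in parse_script(script):
    print(action, args)
```

Keeping the parser separate from the browser driver means the same script could, in principle, be checked for syntax errors without launching a browser.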
IGTJP017 Analyzing Sentiments in One Go: A Supervised Joint Topic Modeling
Approach (IEEE-2017)
The world wide web can be viewed as a repository of opinions from users
spread across various websites and networks, and today's netizens look up
reviews and opinions to judge commodities and visit forums to debate events
and policies. With this explosion in the volume of and reliance on user
reviews and opinions, manufacturers and retailers face the challenge of
automating the analysis of such large amounts of data (user reviews, opinions,
sentiments). Armed with these results, sellers can enhance their products and
tailor the experience for the customer. Similarly, policy makers can analyse
these posts to get instant and comprehensive feedback, or use them for new
ideas that democratize the policy-making process. This paper is the outcome
of our research in gathering opinion and
review data from popular portals, e-commerce websites, forums or social
networks; and processing the data using the rules of natural language and
grammar to find out what exactly was being talked about in the user's
review and the sentiments that people are expressing. Our approach
diligently scans every line of data, and generates a cogent summary of every
review (categorized by aspects) along with various graphical visualizations.
A novel application of this approach is helping out product manufacturers
or the government in gauging response.
IGTJP018 Senti Review: Sentiment Analysis based on Text and Emoticons
(IEEE-2017)
In a competitive world the number of companies is increasing day by day.
Each company says its product is good, but a new buyer cannot know whether
the product is good until he or she buys it. This is a disadvantage for
consumers. Top companies provide a star-based rating system in which more
stars mean a better product, but such ratings rest only on a numerical scale.
Text mining is a significant subject that can address this issue. It is an
approach in which high-quality information is extracted by analyzing the data
and then identifying patterns or reaching conclusions or recommendations.
IGTJP019 Recommendations of Food Delivery Application based on Sentiment
Score Computation
In this work the application can collect offline reviews (a text data set) as
well as online reviews from a predefined set of websites using a web crawler
algorithm. The approach is organized around features: food quality, delivery
time, cost, and package cleanness. The next step is to determine the
sentiments, namely positive polarity and negative polarity, by dividing the
reviews into sentences and then computing the polarities per review and per
feature. The polarities across reviews are then added together to obtain the
polarity for each food app. Once the app-level polarities are obtained, the
food apps are ranked by maximum positive polarity and minimum negative
polarity.
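The per-feature polarity computation and ranking described for IGTJP019 could look roughly like the sketch below. The keyword lists, the sentiment lexicon, and the function names are hypothetical stand-ins; a real system would use a much larger lexicon and would need a rule for sentences that mention more than one feature (here such a sentence is counted once per feature).

```python
import re

# Hypothetical keyword lists and sentiment lexicon, for illustration only.
FEATURES = {
    "food quality": {"food", "taste", "tasty", "fresh", "stale"},
    "delivery time": {"delivery", "late", "fast", "quick", "slow"},
    "cost": {"price", "cost", "cheap", "expensive"},
    "package cleanness": {"package", "packaging", "clean", "dirty"},
}
POSITIVE = {"tasty", "fresh", "fast", "quick", "cheap", "clean", "good", "great"}
NEGATIVE = {"stale", "late", "slow", "expensive", "dirty", "bad", "cold"}

def score_review(review):
    """Return {feature: [positive, negative]} word counts for one review."""
    scores = {feature: [0, 0] for feature in FEATURES}
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
        for feature, keywords in FEATURES.items():
            if words & keywords:  # the sentence talks about this feature
                scores[feature][0] += pos
                scores[feature][1] += neg
    return scores

def app_polarity(reviews):
    """Add per-review, per-feature polarities into one (positive, negative) pair."""
    total_pos = total_neg = 0
    for review in reviews:
        for pos, neg in score_review(review).values():
            total_pos += pos
            total_neg += neg
    return total_pos, total_neg

def rank_apps(reviews_by_app):
    """Rank apps by highest positive polarity, then lowest negative polarity."""
    polarity = {app: app_polarity(reviews) for app, reviews in reviews_by_app.items()}
    return sorted(polarity, key=lambda app: (-polarity[app][0], polarity[app][1]))
```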
IGTJP020 Enhanced Password Processing Scheme Based on Visual Cryptography
and OCR (IEEE-2017)
The traditional password conversion scheme for user authentication transforms
passwords into hash values. These hash-based password schemes are
comparatively simple and fast because they are based on text and well-known
cryptography. However, they are exposed to cyber-attacks that use
password-cracking tools or online hash-cracking sites. Attackers can recover
an original password from its hash value when the password is relatively
simple and plain. As a result, many hacking incidents have occurred,
predominantly in systems that adopt hash-based schemes. In this work, we
suggest an enhanced, image-based password processing scheme using visual
cryptography (VC). Unlike the traditional scheme based on hashes and text, our
scheme transforms a text user ID into two images encrypted by VC. The user
creates two images composed of subpixels using a random function with a SEED
that includes personal information. The server holds only the user's ID and
one of the images instead of a password. When the user logs in and sends the
other image, the server can extract the ID using OCR (Optical Character
Recognition) and authenticate the user by comparing the extracted ID with the
saved one. Our proposal requires less computation, prevents cyber-attacks
aimed at hash cracking, and supports authentication without exposing personal
information such as the ID to attackers.
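As a rough illustration of the visual cryptography step, the sketch below splits a black-and-white image into two subpixel shares and stacks them again. It assumes the classic (2, 2) scheme with two subpixels per pixel, which the abstract does not confirm; the function names and the way the seed is used are likewise assumptions, and rendering the ID as an image and the OCR step are left out.

```python
import random

# Each secret pixel expands to two subpixels: 1 = black, 0 = transparent.
PATTERNS = [(0, 1), (1, 0)]

def make_shares(secret, seed):
    """Split a binary image (rows of 0/1, 1 = black) into two random-looking shares."""
    rng = random.Random(seed)  # assumed: the SEED drives the random function
    share1, share2 = [], []
    for row in secret:
        row1, row2 = [], []
        for pixel in row:
            p = rng.choice(PATTERNS)
            # White pixel: both shares use the same pattern.
            # Black pixel: the second share uses the complementary pattern.
            q = p if pixel == 0 else (1 - p[0], 1 - p[1])
            row1.extend(p)
            row2.extend(q)
        share1.append(row1)
        share2.append(row2)
    return share1, share2

def stack(share1, share2):
    """Overlay two shares: a subpixel is black if it is black in either share."""
    return [[a | b for a, b in zip(r1, r2)] for r1, r2 in zip(share1, share2)]

def decode(stacked):
    """A pixel is black only when both of its stacked subpixels are black."""
    return [
        [1 if row[i] + row[i + 1] == 2 else 0 for i in range(0, len(row), 2)]
        for row in stacked
    ]

secret = [[1, 0, 1],
          [0, 0, 1]]
s1, s2 = make_shares(secret, seed=42)
print(decode(stack(s1, s2)) == secret)
```

Each share on its own always shows exactly one black subpixel per pixel, so a single stored image reveals nothing about the secret; only the stacked pair does.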
IGTJP021 Improved Approach For Infrequent Weighted Itemsets in Data
Mining (IEEE-2017)
Association rule mining is receiving growing attention today, with increasing
work on finding frequent as well as infrequent item sets in transactional
databases. A good example of association rule mining: if a customer buys
butter and bread, the customer tends to buy milk as well. The goal of
association rule mining is to find correlations among different data sets in
a database. Item set mining is essential in data mining for finding the
relationships in a dataset. Frequent item set mining finds items that occur
frequently in a transactional database, whereas infrequent item set mining
finds those that occur rarely in it.
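The difference between frequent and infrequent item sets, and the bread-and-butter rule, can be made concrete with a small support-counting sketch. It is an unweighted, brute-force illustration with hypothetical function names and a hypothetical `min_support` threshold; the weighted variant named in the title is not modelled, and only item sets that occur at least once are listed.

```python
from collections import Counter
from itertools import combinations

def itemset_support(transactions, size):
    """Count in how many transactions each item set of the given size occurs."""
    counts = Counter()
    for transaction in transactions:
        for itemset in combinations(sorted(set(transaction)), size):
            counts[itemset] += 1
    return counts

def split_by_support(transactions, size, min_support):
    """Separate item sets into frequent (>= min_support) and infrequent ones."""
    counts = itemset_support(transactions, size)
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    infrequent = {s: c for s, c in counts.items() if c < min_support}
    return frequent, infrequent

def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    matching = [t for t in transactions if antecedent <= set(t)]
    if not matching:
        return 0.0
    return sum(1 for t in matching if consequent <= set(t)) / len(matching)

transactions = [
    ["bread", "butter", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
]
frequent, infrequent = split_by_support(transactions, size=2, min_support=2)
print(frequent)
print(infrequent)
print(confidence(transactions, {"bread", "butter"}, {"milk"}))
```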
IGTJP022 A Rating Approach based on Sentiment Analysis (IEEE-2017)
Sentiment analysis is the study of the opinions, expressions, likes, and
dislikes of customers towards entities such as products, services,
organizations, and individuals. With the exponential growth of social media
and ecommerce websites such as flipkart.com and amazon.com, people can share
their experiences of various products through web descriptions, comments, or
ratings. A product's reputation is based on the cumulative opinion of its
online users. Sentiment analysis, or computational analysis of opinion, has
attracted a great deal of attention because of its potential applications in
the e-commerce domain, online discussion forums, and web description sites.
Sentiment analysis is challenging because it does not work well with basic
lexicon-based classification: web descriptions are unstructured and written
in natural language.
IGTJP023 A Novel Recommendation Model Regularized with User Trust and Item
Ratings
We propose TrustSVD, a trust-based matrix factorization technique for
recommendations. TrustSVD integrates multiple information sources into
the recommendation model in order to reduce the data sparsity and cold
start problems and their degradation of recommendation performance.
An analysis of social trust data from four real-world data sets suggests that
not only the explicit but also the implicit influence of both ratings and trust
should be taken into consideration in a recommendation model. TrustSVD
therefore builds on top of a state-of-the-art recommendation algorithm,
SVD++ (which uses the explicit and implicit influence of rated items), by
further incorporating both the explicit and implicit influence of trusted
and trusting users on the prediction of items for an active user. The
proposed technique is the first to extend SVD++ with social trust
information. Experimental results on the four datasets demonstrate that
TrustSVD achieves better accuracy than ten other counterpart
recommendation techniques.
IGTJP024 Connecting Social Media to E-Commerce: Cold-Start Product
Recommendation using Microblogging Information
In recent years, the boundaries between e-commerce and social
networking have become increasingly blurred. Many e-commerce websites
support the mechanism of social login where users can sign on the websites
using their social network identities such as their Facebook or Twitter
accounts. Users can also post their newly purchased products on
microblogs with links to the e-commerce product web pages. In this paper
we propose a novel solution for cross-site cold-start product
recommendation, which aims to recommend products from e-commerce
websites to users at social networking sites in cold-start situations, a
problem which has rarely been explored before. A major challenge is how
to leverage knowledge extracted from social networking sites for cross-site
cold-start product recommendation. We propose to use the linked users
across social networking sites and e-commerce websites (users who have
social networking accounts and have made purchases on e-commerce
websites) as a bridge to map users’ social networking features to another
feature representation for product recommendation. Specifically, we
propose learning both users' and products' feature representations (called
user embeddings and product embeddings, respectively) from data
collected from e-commerce websites using recurrent neural networks and
then applying a modified gradient boosting trees method to transform users'
social networking features into user embeddings. We then develop a
feature-based matrix factorization approach which can leverage the learnt
user embeddings for cold-start product recommendation.
IGTJP025 Cross-Platform Identification of Anonymous Identical Users in Multiple
Social Media Networks
The last few years have witnessed the emergence and evolution of a
vibrant research stream on a large variety of online social media network
(SMN) platforms. Recognizing anonymous, yet identical users among
multiple SMNs is still an intractable problem. Clearly, cross-platform
exploration may help solve many problems in social computing in both
theory and applications. Since public profiles can be duplicated and easily
impersonated by users with different purposes, most current user
identification resolutions, which mainly focus on text mining of users’
public profiles, are fragile. Some studies have attempted to match users
based on the location and timing of user content as well as writing style.
However, the locations are sparse in the majority of SMNs, and writing
style is difficult to discern from the short sentences of leading SMNs such
as Sina Microblog and Twitter. Moreover, since online SMNs are quite
symmetric, existing user identification schemes based on network
structure are not effective. The real-world friend cycle is highly individual
and virtually no two users share a congruent friend cycle. Therefore, it is
more accurate to use a friendship structure to analyze cross-platform
SMNs. Since identical users tend to set up partially similar friendship
structures in different SMNs, we proposed the Friend Relationship-Based
User Identification (FRUI) algorithm. FRUI calculates a match degree for all
candidate User Matched Pairs (UMPs), and only UMPs with top ranks are
considered as identical users. We also developed two propositions to
improve the efficiency of the algorithm. Results of extensive experiments
demonstrate that FRUI performs much better than current network
structure-based algorithms.
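The idea of ranking candidate User Matched Pairs by a friendship-based match degree can be sketched as follows. This is one plausible reading of the abstract and not the published FRUI algorithm: the match degree is assumed to be the number of already-identified pairs that both candidates are friends with, and the seed pairs, the `min_degree` threshold, and the function names are hypothetical.

```python
def match_degree(a, b, friends_a, friends_b, matched):
    """Count known matched pairs (x, y) where x is a friend of a and y is a friend of b."""
    return sum(1 for x in friends_a[a] if matched.get(x) in friends_b[b])

def identify_users(friends_a, friends_b, seeds, min_degree=2):
    """Greedily accept the top-ranked candidate pair until none reaches min_degree."""
    matched = dict(seeds)  # user in network A -> same person in network B
    while True:
        used_b = set(matched.values())
        best = None
        for a in sorted(friends_a):
            if a in matched:
                continue
            for b in sorted(friends_b):
                if b in used_b:
                    continue
                degree = match_degree(a, b, friends_a, friends_b, matched)
                if degree >= min_degree and (best is None or degree > best[0]):
                    best = (degree, a, b)
        if best is None:
            return matched
        matched[best[1]] = best[2]  # each new match can raise later degrees

# Two small friendship graphs with the same shape (adjacency sets).
friends_a = {"a1": {"a2", "a3", "a4"}, "a2": {"a1", "a3"},
             "a3": {"a1", "a2", "a4"}, "a4": {"a1", "a3"}}
friends_b = {"b1": {"b2", "b3", "b4"}, "b2": {"b1", "b3"},
             "b3": {"b1", "b2", "b4"}, "b4": {"b1", "b3"}}
print(identify_users(friends_a, friends_b, seeds={"a1": "b1", "a2": "b2"}))
```

Re-scoring all candidates after every accepted pair is quadratic per round; it keeps the sketch short but would need indexing to scale to real networks.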
IGTJP026 Cyberbullying Detection based on Semantic-Enhanced Marginalized
Denoising Auto-Encoder
In the proposed system tweets are classified as threat tweets or non-threat
tweets using sets of keywords belonging to different categories. For each
category the probability is computed, then the contingency is computed, and
finally the results are sorted in order to assign the tweet to a cyber threat
category. The framework helps the e-crime department identify suspicious
words in cyber messages and trace the suspected culprits. Existing instant
messengers and social networking sites lack these features for capturing
significant suspicious patterns of threat activity from dynamic messages and
for finding relationships among people, places, and things during online chat,
even as criminals have adapted to these platforms.
IGTJP027 Disease Prediction by Machine Learning over Big Data from Healthcare
Communities (MongoDB)
With big data growth in biomedical and healthcare communities, accurate
analysis of medical data benefits early disease detection, patient care, and
community services. However, analysis accuracy is reduced when the medical
data is incomplete. Moreover, different regions exhibit unique
characteristics of certain regional diseases, which may weaken the
prediction of disease outbreaks. In this paper, we streamline machine learning
algorithms for effective prediction of chronic disease outbreaks in
disease-frequent communities. We evaluate the modified prediction models on
real-life hospital data collected from central China in 2013-2015. To overcome
the difficulty of incomplete data, we use a latent factor model to reconstruct
the missing data. We experiment on a regional chronic disease, cerebral
infarction. We propose a new convolutional neural network based multimodal
disease risk prediction (CNN-MDRP) algorithm using structured and
unstructured data from hospital. To the best of our knowledge, none of the
existing work focused on both data types in the area of medical big data
analytics. Compared to several typical prediction algorithms, the prediction
accuracy of our proposed algorithm reaches 94.8% with a convergence speed
which is faster than that of the CNN-based unimodal disease risk prediction
(CNN-UDRP) algorithm.
IGTJP028 Detecting Malicious Facebook Applications
With 20 million installs a day, third-party apps are a major reason for the popularity
and addictiveness of Facebook. Unfortunately, hackers have realized the potential of
using apps for spreading malware and spam. The problem is already significant, as
we find that at least 13% of apps in our dataset are malicious. So far, the research
community has focused on detecting malicious posts and campaigns. In this paper,
we ask the question: given a Facebook application, can we determine if it is malicious?
Our key contribution is in developing FRAppE—Facebook’s Rigorous Application
Evaluator— arguably the first tool focused on detecting malicious apps on Facebook.
To develop FRAppE, we use information gathered by observing the posting behavior
of 111K Facebook apps seen across 2.2 million users on Facebook. First, we identify a
set of features that help us distinguish malicious apps from benign ones. For example,
we find that malicious apps often share names with other apps, and they typically
request fewer permissions than benign apps. Second, leveraging these distinguishing
features, we show that FRAppE can detect malicious apps with 99.5% accuracy, with
no false positives and a low false negative rate (4.1%). Finally, we explore the
ecosystem of malicious Facebook apps and identify mechanisms that these apps use
to propagate. Interestingly, we find that many apps collude and support each other;
in our dataset, we find 1,584 apps enabling the viral propagation of 3,723 other apps
through their posts. Long-term, we see FRAppE as a step towards creating an
independent watchdog for app assessment and ranking, so as to warn Facebook users
before installing apps.
IGTJP029 NetSpam: a Network-based Spam Detection Framework for Reviews in
Online Social Media (IEEE-2017)
Nowadays, a large share of people rely on content available in social media
when making decisions (e.g. reviews and feedback on a topic or product). The
possibility that anybody can leave a review provides a golden opportunity for
spammers to write spam reviews about products and services for different
interests. Identifying these spammers and the spam content is a hot topic of
research. Although a considerable number of studies have been done recently
toward this end, the methodologies put forth so far still barely detect spam
reviews, and none of them shows the importance of each extracted feature
type. In this study, we propose a novel framework, named NetSpam, which
utilizes spam features for modeling review datasets as heterogeneous
information networks to map the spam detection procedure into a
classification problem in such networks. Using the importance of spam
features helps us obtain better results in terms of different metrics on
real-world review datasets from the Yelp and Amazon websites. The results show
that NetSpam outperforms the existing methods and that, among four categories
of features (review-behavioral, user-behavioral, review-linguistic, and
user-linguistic), the first type of features performs better than the other
categories.
IGTJP030 SMARTCRAWLER A TWO-STAGE CRAWLER FOR EFFICIENTLY HARVESTING
DEEP-WEB
As the deep web grows at a very fast pace, there has been increased interest in
techniques that help efficiently locate deep-web interfaces. However, due to
the large volume of web resources and the dynamic nature of the deep web,
achieving wide coverage and high efficiency is a challenging issue. We
propose a two-stage framework, namely SmartCrawler, for
efficiently harvesting deep-web interfaces. In the first stage, SmartCrawler
performs site-based searching for center pages with the help of search
engines, avoiding visiting a large number of pages. To achieve more accurate
results for a focused crawl, SmartCrawler ranks websites to prioritize highly
relevant ones for a given topic. In the second stage, SmartCrawler achieves
fast in-site searching by excavating the most relevant links with adaptive link
ranking. To eliminate bias toward visiting some highly relevant links in hidden
web directories, we design a link tree data structure to achieve wider
coverage for a website. Our experimental results on a set of representative
domains show the agility and accuracy of our proposed crawler framework,
which efficiently retrieves deep-web interfaces from large-scale sites and
achieves higher harvest rates than other crawlers.
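The two-stage idea (rank sites first, then rank links inside each site) can be illustrated with a minimal Java sketch. Everything named here is an assumption made for illustration: the class TwoStageCrawlSketch, the keyword-overlap relevance score, and the example hosts are hypothetical, and the sketch does not reproduce the adaptive learning or the link tree described in the abstract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative two-stage ordering: rank sites first, then rank links inside each site. */
final class TwoStageCrawlSketch {

    /** Hypothetical relevance score: fraction of topic terms that occur in the text. */
    static double relevance(String text, List<String> topicTerms) {
        if (topicTerms.isEmpty()) {
            return 0.0;
        }
        String lower = text.toLowerCase();
        int hits = 0;
        for (String term : topicTerms) {
            if (lower.contains(term.toLowerCase())) {
                hits++;
            }
        }
        return (double) hits / topicTerms.size();
    }

    /** Returns a copy ordered from most to least relevant; ties are broken alphabetically. */
    static List<String> rank(List<String> items, List<String> topicTerms) {
        List<String> ordered = new ArrayList<>(items);
        ordered.sort((a, b) -> {
            int byScore = Double.compare(relevance(b, topicTerms), relevance(a, topicTerms));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return ordered;
    }

    /** Stage 1 orders the sites; stage 2 orders the links of each site, in site order. */
    static List<String> crawlOrder(Map<String, List<String>> linksBySite, List<String> topicTerms) {
        List<String> visitOrder = new ArrayList<>();
        for (String site : rank(new ArrayList<>(linksBySite.keySet()), topicTerms)) {
            visitOrder.addAll(rank(linksBySite.get(site), topicTerms));
        }
        return visitOrder;
    }

    public static void main(String[] args) {
        Map<String, List<String>> linksBySite = Map.of(
                "books.example", List.of("/about", "/book/search"),
                "news.example", List.of("/sports"));
        System.out.println(crawlOrder(linksBySite, List.of("book", "search")));
        // prints [/book/search, /about, /sports]
    }
}
```

In this toy run the more relevant site is visited first, and within it the search-form link is visited before the unrelated page, which is the prioritization effect the abstract attributes to site ranking and link ranking.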
IGTJP031 A Shoulder Surfing Resistant Graphical Authentication System
This evolution brings great convenience but also increases the probability
of exposing passwords to shoulder surfing attacks. Attackers can observe
directly or use external recording devices to collect users’ credentials. To
overcome this problem, we propose a novel authentication system, Pass
Matrix, based on graphical passwords to resist shoulder surfing attacks.
With a one-time valid login indicator and circulative horizontal and vertical
bars covering the entire scope of pass-images, Pass Matrix offers no hint
for attackers to figure out or narrow down the password even if they conduct
multiple camera-based attacks. We also implemented a Pass Matrix
prototype on Android and carried out real user experiments to evaluate its
memorability and usability. According to the experimental results, the proposed
system achieves better resistance to shoulder surfing attacks while
maintaining usability.
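One way to picture a circulative bar is as a circular array that the user rotates until the one-time indicator symbol lines up with the secret position. The Java sketch below is only an illustration under that assumption: the class CircularBarSketch, its method names, and the four-symbol bar are hypothetical and are not taken from the prototype mentioned in the abstract.

```java
/** Illustrative circular bar: rotate until the indicator symbol sits at the target index. */
final class CircularBarSketch {

    /** Number of right-rotations that move the indicator symbol to targetIndex. */
    static int shiftsNeeded(char[] bar, char indicator, int targetIndex) {
        for (int i = 0; i < bar.length; i++) {
            if (bar[i] == indicator) {
                return Math.floorMod(targetIndex - i, bar.length);
            }
        }
        throw new IllegalArgumentException("indicator is not on the bar");
    }

    /** Returns a new bar rotated to the right by the given number of positions. */
    static char[] rotateRight(char[] bar, int shifts) {
        char[] rotated = new char[bar.length];
        for (int i = 0; i < bar.length; i++) {
            rotated[Math.floorMod(i + shifts, bar.length)] = bar[i];
        }
        return rotated;
    }

    public static void main(String[] args) {
        char[] bar = {'A', 'B', 'C', 'D'};
        int shifts = shiftsNeeded(bar, 'B', 3);
        System.out.println(shifts);                               // prints 2
        System.out.println(new String(rotateRight(bar, shifts))); // prints CDAB
    }
}
```

Because the indicator changes on every login in this reading, an observer who records only the rotation cannot tell which position on the bar the user was aiming at.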
IGTJP032 Improving Automated Bug Triaging with Specialized Topic Model(IEEE-
2017)
Software companies spend over 45 percent of cost in dealing with
software bugs. An inevitable step of fixing bugs is bug triage, which aims to
correctly assign a developer to a new bug. To decrease the time cost in
manual work, text classification techniques are applied to conduct
automatic bug triage. In this paper, we address the problem of data
reduction for bug triage, i.e., how to reduce the scale and improve the
quality of bug data. We combine instance selection with feature selection
to simultaneously reduce data scale on the bug dimension and the word
dimension. To determine the order of applying instance selection and
feature selection, we extract attributes from historical bug data sets and
build a predictive model for a new bug data set.
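Reducing bug data on both the word dimension and the bug dimension can be sketched in a few lines of Java. The selection rules below are simple stand-ins chosen for illustration, not the algorithms evaluated in the paper: feature selection keeps words that appear in at least minReports reports, and instance selection drops reports that become empty or duplicate after that step. The class name BugDataReductionSketch and its methods are hypothetical, and the sketch fixes one order (feature selection, then instance selection), whereas the abstract predicts the order per data set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative data reduction: feature selection first, then instance selection. */
final class BugDataReductionSketch {

    /** Word dimension: keep words that occur in at least minReports bug reports. */
    static Set<String> selectWords(List<Set<String>> reports, int minReports) {
        Map<String, Integer> reportCount = new HashMap<>();
        for (Set<String> report : reports) {
            for (String word : report) {
                reportCount.merge(word, 1, Integer::sum);
            }
        }
        Set<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : reportCount.entrySet()) {
            if (entry.getValue() >= minReports) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    /** Bug dimension: project each report onto the kept words, then drop empty and duplicate reports. */
    static List<Set<String>> reduce(List<Set<String>> reports, int minReports) {
        Set<String> vocabulary = selectWords(reports, minReports);
        Set<Set<String>> unique = new LinkedHashSet<>();
        for (Set<String> report : reports) {
            Set<String> projected = new TreeSet<>(report);
            projected.retainAll(vocabulary);
            if (!projected.isEmpty()) {
                unique.add(projected);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Set<String>> reports = List.of(
                Set.of("crash", "login"),
                Set.of("crash", "login", "typo"),
                Set.of("render"));
        System.out.println(reduce(reports, 2)); // prints [[crash, login]]
    }
}
```

Three reports with four distinct words shrink to one report with two words, which shows how the two selections together reduce scale on both dimensions.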
IGTJP033 Search Rank Fraud and Malware Detection in Google Play(IEEE-2017)
Fraudulent behaviors in Google Play, the most popular Android app market,
fuel search rank abuse and malware proliferation. To identify malware,
previous work has focused on app executable and permission analysis. In
this paper, we introduce FairPlay, a novel system that discovers and
leverages traces left behind by fraudsters, to detect both malware and apps
subjected to search rank fraud. FairPlay correlates review activities and
uniquely combines detected review relations with linguistic and behavioral
signals gleaned from Google Play app data (87K apps, 2.9M reviews, and
2.4M reviewers, collected over half a year), in order to identify suspicious
apps. FairPlay achieves over 95% accuracy in classifying gold standard
datasets of malware, fraudulent and legitimate apps. We show that 75% of
the identified malware apps engage in search rank fraud. FairPlay discovers
hundreds of fraudulent apps that currently evade Google Bouncer’s
detection technology. FairPlay also helped the discovery of more than 1,000
reviews, reported for 193 apps, that reveal a new type of “coercive” review
campaign: users are harassed into writing positive reviews, and into installing
and reviewing other apps.
IGTJP034 Trust-based Service Management for Social Internet of Things Systems
A social Internet of Things (IoT) system can be viewed as a mix of traditional
peer-to-peer networks and social networks, where “things” autonomously
establish social relationships according to the owners’ social networks, and
seek trusted “things” that can provide services needed when they come into
contact with each other opportunistically. We propose and analyze the
design notion of adaptive trust management for social IoT systems in which
social relationships evolve dynamically among the owners of IoT devices.
We reveal the design tradeoff between trust convergence and trust
fluctuation in our adaptive trust management protocol design. With our
adaptive trust management protocol, a social IoT application can adaptively
choose the best trust parameter settings in response to changing IoT social
conditions, such that trust assessment is accurate and
application performance is maximized.
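The tradeoff between convergence and fluctuation can be illustrated with a weighted trust update, in which a parameter alpha sets how much weight a new observation receives. The Java sketch below assumes this exponential-averaging form for illustration only; the class TrustUpdateSketch, the parameter name alpha, and the sample values are hypothetical and are not taken from the protocol itself.

```java
/** Illustrative trust update: new trust = (1 - alpha) * old trust + alpha * observation. */
final class TrustUpdateSketch {

    /** Blends the previous trust value with one new observation; alpha must lie in [0, 1]. */
    static double update(double previousTrust, double observation, double alpha) {
        if (alpha < 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in [0, 1]");
        }
        return (1.0 - alpha) * previousTrust + alpha * observation;
    }

    /** Applies the update once per observation, in order, and returns the final trust value. */
    static double replay(double initialTrust, double[] observations, double alpha) {
        double trust = initialTrust;
        for (double observation : observations) {
            trust = update(trust, observation, alpha);
        }
        return trust;
    }

    public static void main(String[] args) {
        double[] goodService = {1.0, 1.0};
        System.out.println(replay(0.5, goodService, 0.5)); // prints 0.875
        System.out.println(replay(0.5, goodService, 0.0)); // prints 0.5
    }
}
```

A larger alpha moves trust toward recent observations faster (quicker convergence) but also lets a single bad observation swing the value more (larger fluctuation); a smaller alpha does the opposite. Choosing alpha adaptively is the kind of parameter setting the abstract refers to.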
Head Office: No.1 Rated company in Bangalore for all
software courses and Final Year Projects
IGEEKS Technologies
No:19, MN Complex, 2nd Cross,
Sampige Main Road, Malleswaram, Bangalore
Karnataka (560003) India. Above HOP Salon,
Opp. Joyalukkas, Malleswaram,
Landmark: Near to Mantri Mall, Malleswaram
Bangalore.
Email: nanduigeeks2010@gmail.com ,
nandu@igeekstechnologies.com
Office Phone:
9590544567 / 7019280372
Contact Person:
Mr. Nandu Y,
Director-Projects,
Mobile: 9590544567,7019280372
E-mail: nandu@igeekstechnologies.com
nanduigeeks2010@gmail.com
Partners Address:
RAJAJINAGAR:
#531, 63rd Cross,
12th Main, after sevabhai hospital,
5th Block, Rajajinagar,
Bangalore-10.
Landmark: Near Bashyam circle.
JAYANAGAR:
No 346/17, Manandi Court,
3rd Floor, 27th Cross,
Jayanagar 3rd Block East,
Bangalore - 560011,
Landmark: Near BDA Complex.
More than 12 years’ experience in IEEE Final Year Project Center,

IGEEKS Technologies Supports you in Java, IOT, Python, Bigdata
Hadoop, Machine Learning, Data Mining, Networking, Embedded,
VLSI, MATLAB, Power Electronics, Power System Technologies.
For Titles and Abstracts visit our website www.makefinalyearproject.com
