About the Project
The amount of data in today’s world is ever increasing, and has even given rise to new terms such as Big Data. Fully automated data mining technologies that can be used to understand this data do not currently exist. The closest alternatives are very expensive data mining software suites, which in any case require large amounts of human direction and interaction and so are not fully automated. Training for these suites is also very expensive, so the majority of companies cannot afford either the time or the expense of using them.
Technology developed at the University of Aberdeen over a number of years, which has resulted in a spinout company, could help to solve this problem. This technology applies artificial intelligence techniques and exploits the power of statistical methods alongside modern computational power. It is specifically designed to allow fully automated data analysis and can potentially be applied to any type of problem.
This technology has general applicability and can be used on problems as varied as traditional condition monitoring, financial instrument analysis, genomics analysis and the analysis of social media data. This project will look at improving these techniques and comparing them against other readily available methods. In particular, the project will examine how important features can be identified, and how data can be automatically transformed to give the features of interest.
This project will focus on automating data analysis tasks, particularly for numerical data, although the techniques developed could be applied to other domains such as textual data analysis. The data mining process currently requires human interaction and guidance throughout, and this project will look to exploit artificial intelligence techniques to automate the data analysis and feature extraction tasks.
The project will build on existing research work in this area, and the problem can be broken down into a number of distinct parts. One key part is the identification of important and irrelevant features in a dataset, a task that cannot currently be undertaken in an automated manner. This needs to be undertaken using explainable artificial intelligence methods so that, when classification is not successful, the decisions that have been made can be explained to and understood by a domain expert.
The overall methodology developed will be tested against both synthetic and real-world datasets.
This work has great commercial value and is likely to be of interest to companies in the data analysis field.
Candidates should have (or expect to achieve) a UK honours degree at 2.1 or above (or equivalent) in Engineering, Physics or Computing Science. Candidates require very good mathematical and programming skills; a background in computer coding, algorithms, mathematics and data analysis is essential.
APPLICATION PROCEDURE:
• Apply for Degree of Doctor of Philosophy in Engineering
• State name of the lead supervisor as the Name of Proposed Supervisor
• State ‘Self-funded’ as Intended Source of Funding
• State the exact project title on the application form
When applying please ensure all required documents are attached:
• All degree certificates and transcripts (Undergraduate AND Postgraduate MSc, officially translated into English where necessary)
• Detailed CV
Informal enquiries can be made to Dr A Starkey ([Email Address Removed]), with a copy of your curriculum vitae and cover letter. All general enquiries should be directed to the Postgraduate Research School ([Email Address Removed]).
References
Abdul Aziz, A & Starkey, A 2020, 'Predicting Supervise Machine Learning Performances for Sentiment Analysis Using Contextual-Based Approaches', IEEE Access, vol. 8, pp. 17722-17733. DOI: https://doi.org/10.1109/ACCESS.2019.2958702
Ahmad, AU & Starkey, A 2018, 'Application of feature selection methods for automated clustering analysis: a review on synthetic datasets', Neural Computing and Applications, vol. 29, no. 7, pp. 317-328. DOI: https://doi.org/10.1007/S00521-017-3005-9
Starkey, A & Ahmad, AU 2018, 'Semi-automated data classification with feature weighted self organizing map', in ICNC-FSKD 2017 - 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, Institute of Electrical and Electronics Engineers Inc., pp. 136-141, Guilin, Guangxi, China, 29/07/17. DOI: https://doi.org/10.1109/FSKD.2017.8392964
Starkey, A, Ahmad, AU & Hamdoun, H 2017, 'Automated Feature Identification and Classification Using Automated Feature Weighted Self Organizing Map (FWSOM)', IOP Conference Series: Materials Science and Engineering, vol. 261, no. 1, 012006, pp. 1-7. DOI: https://doi.org/10.1088/1757-899X/261/1/012006
Abdul Aziz, A, Starkey, A & Campbell Bannerman, M 2017, 'Evaluating Cross Domain Sentiment Analysis using Supervised Machine Learning Techniques', in Intelligent Systems Conference 2017, 17652472, IEEE Explore, London, SAI Intelligent Systems Conference 2017 (IntelliSys 2017), London, United Kingdom, 7/09/17. DOI: https://doi.org/10.1109/INTELLISYS.2017.8324369