
Intro

This was my first project in machine learning and statistics. I completed it under the supervision of Prof. Saud Afzal Mohammad, Dept. of Civil Engineering. It proved to be my window into the world of machine learning and artificial intelligence.

I have been pursuing this field ever since. In this project I worked on estimating the return period of wave height using different statistical methods and ML.


Return period estimation

May - July 2018

About my project

How I did it

During the summer vacation of my second year, I asked my professor for a research project in machine learning, as I had just completed an online course on ML. After discussing my interests and skills, I was given a project on sea wave height return period estimation and forecasting. I completed this project as part of a summer research internship at IIT Kharagpur. The aim of the project was to replace the current physical simulation-based models for sea wave height generation, as such models require great computational power. Further, the results can be used in advance to plan sea energy generation and to reduce coastal damage.

Problem statement: Given the dataset for sea wave height, estimate the return period of sea waves and forecast the wave height using Machine Learning. 

The project can be divided into two major parts:

  • Data collection and analysis

  • Distribution model and ML

I worked on this project for about two and a half months, and it helped me gain insight into probability, statistical modeling and ML.

Screenshot (236).png

Skills and Tools

  • MATLAB

  • GEVT and GPD distribution functions

  • Linear Regression

  • Statistics

  • LaTeX

Screenshot (233).png

Data collection and analysis

Currently, physical simulation-based models are used to predict wave height at seashores. One example is the SWAN model (Fig 1). These models are built on the hydrological equations of surface waves; they require great computational power and are not reliable. Our aim is to reduce the computational cost without compromising the quality of the output.

  • The test site was Mehamn Harbour bay, and we used its sea height data. The dataset contains a total of 18 variables (Fig 4), including wave height, temperature, humidity, etc.

  • The readings were taken at 3-hour intervals over 60 years.

  • The data was plotted and standardized for better visualization. (Fig 3) 
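The standardization step can be sketched roughly as follows. This is only an illustration in Python (the project itself used MATLAB), and it assumes that "standardized" means z-scoring, i.e. subtracting the mean and dividing by the standard deviation. The function name `standardize` and the sample values are hypothetical and are not taken from the Mehamn dataset.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(x - mu) / sigma for x in values]


# Hypothetical wave heights in metres (illustrative only).
sample_heights = [1.2, 0.8, 2.5, 3.1, 1.9]
z_scores = standardize(sample_heights)

# A 3-hour sampling interval gives 8 readings per day.
READINGS_PER_DAY = 24 // 3
```

After this step the series has zero mean and unit variance, which makes variables with different units easier to compare on one plot.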

Screenshot (240).png

Distribution model and ML

  • For estimating the return period, we used generalised extreme value theory (GEVT) and the generalised Pareto distribution (GPD), chosen on the basis of the distribution obtained by plotting the data.

  • We calibrated the values of the scale, shape and threshold parameters to get the best fit of the distribution.
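Once the parameters are fitted, the link between them and a return period can be sketched with the standard textbook return-level formulas. The Python below is a minimal illustration, not the MATLAB code used in the project. It assumes the GEV is fitted to annual maxima, the GPD is fitted to exceedances over a threshold, and a positive shape parameter means a heavier tail. All function names and parameter values are hypothetical.

```python
import math


def gev_return_level(T, loc, scale, shape):
    """Level exceeded on average once every T years (GEV on annual maxima)."""
    if T <= 1:
        raise ValueError("return period T must be greater than 1 year")
    y = -math.log(1.0 - 1.0 / T)
    if abs(shape) < 1e-12:  # Gumbel limit as shape -> 0
        return loc - scale * math.log(y)
    return loc + (scale / shape) * (y ** (-shape) - 1.0)


def gpd_return_level(T, threshold, scale, shape, exceed_prob, obs_per_year):
    """Level exceeded on average once every T years (GPD over a threshold).

    exceed_prob is the fraction of observations above the threshold.
    """
    m = T * obs_per_year * exceed_prob  # expected exceedances in T years
    if m <= 0:
        raise ValueError("expected number of exceedances must be positive")
    if abs(shape) < 1e-12:  # exponential limit as shape -> 0
        return threshold + scale * math.log(m)
    return threshold + (scale / shape) * (m ** shape - 1.0)


# Hypothetical parameters, not the values calibrated in the project.
OBS_PER_YEAR = 8 * 365.25  # readings every 3 hours
level_gev = gev_return_level(T=100, loc=5.0, scale=1.0, shape=0.1)
level_gpd = gpd_return_level(T=100, threshold=4.0, scale=0.8, shape=0.1,
                             exceed_prob=0.02, obs_per_year=OBS_PER_YEAR)
```

Read in the other direction, the same formulas give the return period of an observed wave height: it is the T for which the return level equals that height.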

Result

Applying the RNN model gave an RMSE of 1.8975 m.
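For reference, RMSE (root mean squared error) is the square root of the mean squared difference between observed and predicted values, so it carries the same unit as the wave height (metres). The sketch below is a minimal Python illustration with hypothetical numbers; it is not the project's evaluation code and does not reproduce the figure above.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical wave heights in metres (illustrative only).
observed_heights = [1.0, 2.0, 3.0]
predicted_heights = [1.0, 2.0, 5.0]
error = rmse(observed_heights, predicted_heights)
```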
