Financial Instrument Forecast with Artificial Intelligence

In ancient times, trade was carried out by barter. With the use of money and similar means of exchange, the concept of financial instruments emerged. Financial instruments are tools and documents used in the economy; they include foreign exchange rates, securities, cryptocurrencies, indices and funds. Many methods are used in financial instrument forecasting, including technical analysis, fundamental analysis, forecasts based on variables and formulas, time-series algorithms and artificial intelligence algorithms. This study examines the importance of artificial intelligence algorithms in financial instrument forecasting. Since financial instruments are used as a means of investment and trade by all sections of society, namely individuals, families, institutions and states, knowing their future direction is highly important. Accurate forecasts can bring benefits such as increased income and welfare, more economical adjustment of maturities, creation of large-scale financing, minimization of risks, spreading of ownership to the grassroots, and more balanced income distribution. In this study, financial instrument forecasting is carried out with the Long Short Term Memory (LSTM), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN) and Autoregressive Integrated Moving Average (ARIMA) algorithms, together with a new method based on ensemble classification boosting. One forecast is produced by a network comprising an LSTM layer and an RNN output layer. With the ensemble classification boosting method, a new method was applied that gives a more successful result than the forecasts of the other algorithms. At the conclusion of the study, the forecast results of the alternative algorithms were compared with each other and the algorithm that gave the most successful forecast was suggested.
The success rate of the forecast results was increased by comparing the results with different time intervals and training data sets. Furthermore, a new method was developed using the ensemble classification boosting method, and this method yielded a more successful result than the most successful algorithm result.


I. Introduction
Financial instruments are defined as contracts that generate financial assets and financial liabilities through financial market instruments. Financial market instruments take the form of payment instruments, financial rights and the economic assets closely related to these (Korkmaz and Bakkal, 2011; Yavilioğlu and Delice, 2006). Financial instruments such as exchange rates, commodities, securities, cryptocurrencies, indices and funds are indispensable to economic life. Accurate estimation of financial instruments is important for the effective management of personal, corporate and national economies. Accurate estimation means increasing the welfare of individuals through more productive investment, ensuring maturity adjustment between the suppliers and demanders of funds, creating large-scale financing by pooling small savings, minimizing risk by establishing a balance between alternative forms of financing, spreading ownership by including small savers in capital markets, lowering the cost of using funds through increased competition, and achieving a more balanced income distribution. Interpreting increasingly complicated markets correctly is quite difficult even for financial professionals, and it is harder still for individuals without sufficient financial literacy (Gümüş and Pailer, 2019).
At this point, estimates can be made using time series, in which future data are estimated from a past data set consisting of regular and consecutive observations. However, since time-series data are extremely comprehensive and grow every second, and periodic breaks occur in the series, the most suitable approach is to carry out the estimation with artificial intelligence, which is successful in the analysis of financial big data. Indeed, financial instrument estimation using artificial intelligence is the most up-to-date financial instrument estimation method, and there are similar studies in the literature. For example, time series analysis of BTC/USD currency pair data and financial time series trend estimation have been carried out using LSTM and RNN neural networks (Safiullin et al., 2016; Qi, Khushi and Poon, 2020). Similarly, studies on the forex markets cover EUR/GBP, USD/EUR, USD/JPY, USD/GBP, USD/AUD, USD/CAD, USD/CHF and USD/CNY (Ahmed et al., 2020; Nagpure, 2019). There are also studies in the literature in which ensemble classification algorithms and hybrid methods are used in financial instrument estimation (Bui, Vu and Dinh, 2018; Chowdhury et al., 2020).
Within the scope of the study, a network will be created from RNN and LSTM. In addition, a hybrid model will be created from CNN and LSTM. The ARIMA algorithm will be run alongside these models, and the results obtained will be combined through the boosting method, which is one of the ensemble classification methods. In this way, unlike previous studies, a more accurate estimation result is expected from using multiple techniques simultaneously and presenting their results collectively. USD/TRY and ounce gold are chosen as the financial instruments to be estimated, since they are the most traded financial instruments in Turkey. The outputs are expected to contribute a more inclusive perspective on financial instrument estimation to the literature and to create added value for financial literacy.

II. Materials and Methods
This section explains the algorithms used in the study and how they are applied, and points out what this study does differently. The data sets are described first.

Data Set
Within the scope of this study, USD/TRY and ounce gold data sets are studied. For the USD/TRY exchange rate, data for 2019 and for 2003-2020 are studied; for ounce gold, data for 2017 are studied. These data are obtained from the website of the Central Bank of the Republic of Turkey (Türkiye Cumhuriyet Merkez Bankası, TCMB, 2019). Exchange rate reports for the desired date range can be obtained by selecting the effective selling rate for USD from the series page of the central bank's EDDS (Electronic Data Delivery System). For the 2019 USD/TRY exchange rate, 247 days of data are studied. For the 2003-2020 USD/TRY exchange rate, 4293 days of data are studied. For 2017 ounce gold, 250 days of data are studied.

Data Life cycle
One week of financial instrument information for data analysis studies is provided in Table 1. The date field is the date of the financial instrument. The value field represents the current sales exchange rate of the financial instrument. The estimation field represents the direction of the value to be realized the next day.

Within the scope of the study, the life cycle of one-day data is provided below. The processing steps of one-day data are provided in Table 1. Some variables are added case by case to each day's raw data. After these variables are added, the data are processed by each algorithm. The final result is then obtained by applying the boosting method, which is one of the ensemble methods.
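As a minimal sketch of how the estimation field described above could be derived from consecutive daily values, the following Python fragment labels each day with the direction of the next day's value. The function name, the sample values and the rule that an unchanged value counts as Selling are assumptions made for illustration only; the labels Buying and Selling follow the wording used in the findings, and the code is not the authors' Java application.

```python
def label_next_day_direction(values):
    """Label each day (except the last) with the direction of the next day's value.

    Assumption: a rise is labelled "Buying"; a fall or no change is "Selling".
    """
    labels = []
    for today, tomorrow in zip(values, values[1:]):
        labels.append("Buying" if tomorrow > today else "Selling")
    return labels


# Hypothetical one-week series (not taken from Table 1).
week = [5.90, 5.93, 5.91, 5.91, 5.95]
print(label_next_day_direction(week))
# ['Buying', 'Selling', 'Selling', 'Buying']
```

The last day has no label because its next-day value is not yet known; in practice this is the value the algorithms are asked to estimate.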

Recurrent Neural Network
RNN has short-term memory and cannot process long-term data dependencies (Hochreiter and Schmidhuber, 1997). Neural networks are designed according to the data they process. What distinguishes the RNN model from other models is that it is designed to process time series and sequence data. It has many application areas, such as sound processing, image processing, video, financial data analysis, natural language processing, weather forecasting, DNA data, medication data, sentiment analysis and movie recommendation. Time-series estimation is defined as estimating future data by looking at past data.
RNN can estimate and classify future data by remembering the important information in past data, which makes accurate estimates possible. Traditional recurrent neural networks (RNNs) remember the past by establishing recurrent links between past and current computations (Bengio, Courville and Goodfellow, 2016). Figure 2 provides the architectural structure of the RNN algorithm. In the RNN algorithm, the inputs are handled in connection with each other. The nodes in the hidden layer have temporal loops that feed their own output back into themselves, which gives the network a small memory of its own. Although a node produces an output for the input it receives, it does not forget that input; it still remembers it when it moves on to the next step. Each step (each layer of the unrolled network) therefore receives both the current input and the information carried over from the past.
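The recurrence described above can be sketched with a single hidden unit. The fragment below is only an illustration of the standard update h_t = tanh(w_x * x_t + w_h * h_(t-1) + b); the weight values are arbitrary assumptions and do not come from the trained networks of this study.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit recurrent layer over a sequence and return all hidden states.

    w_x, w_h and b are illustrative values, not trained parameters.
    """
    h = 0.0  # initial hidden state: nothing remembered yet
    states = []
    for x in inputs:
        # The new state depends on the current input and on the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# A single impulse followed by zeros: the state decays but stays non-zero,
# which is the "small memory" of the recurrent loop.
print(rnn_forward([1.0, 0.0, 0.0]))
```

Because the remembered value shrinks at every step in this example, the sketch also hints at why a simple RNN struggles with long-term dependencies.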

Long Short Term Memory
Long Short Term Memory (LSTM) is a specialized version of RNN. Recurrent neural networks (RNNs) are used to learn the sequential patterns in time series data (Yu et al., 2021). LSTM is a typical RNN type and can learn over a longer period than a simple RNN (Hochreiter and Schmidhuber, 1996). Unlike a simple RNN, LSTM has long-term memory; its most important feature is remembering long-term data.
While the LSTM algorithm runs, information in the memory can be either kept or forgotten. LSTM has a forget gate, an input gate and an output gate. The forget gate decides which past information is forgotten. The input gate decides which new information is kept in memory. The output gate decides which information is output (Olah, 2015).
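The role of the three gates can be illustrated with one step of a scalar LSTM cell. This is a sketch of the commonly used LSTM equations, assuming a single unit and a hypothetical parameter dictionary p; it is not the network configuration used in this study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell.

    p is a hypothetical dict of scalar weights and biases for each gate.
    Returns the new hidden state h and the new cell state c.
    """
    f = sigmoid(p["wf_x"] * x + p["wf_h"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi_x"] * x + p["wi_h"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg_x"] * x + p["wg_h"] * h_prev + p["bg"])  # candidate memory
    o = sigmoid(p["wo_x"] * x + p["wo_h"] * h_prev + p["bo"])    # output gate
    c = f * c_prev + i * g   # keep part of the old memory, add part of the new
    h = o * math.tanh(c)     # expose part of the memory as output
    return h, c


# Illustrative parameters: forget gate almost fully open, input gate almost closed,
# so the cell state is carried over nearly unchanged.
params = {"wf_x": 0.0, "wf_h": 0.0, "bf": 20.0,
          "wi_x": 0.0, "wi_h": 0.0, "bi": -20.0,
          "wg_x": 1.0, "wg_h": 0.0, "bg": 0.0,
          "wo_x": 0.0, "wo_h": 0.0, "bo": 0.0}
h, c = lstm_step(x=0.3, h_prev=0.0, c_prev=0.7, p=params)
print(h, c)
```

With the forget gate close to 1 and the input gate close to 0, the cell state stays near its previous value of 0.7, which is how an LSTM can hold information over many steps.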
In brief, LSTM is a special type of RNN that can keep information in long-term memory. Within the scope of this study, financial instrument analysis is made by creating a network comprising an LSTM layer and an RNN output layer.

Convolutional Neural Networks
Convolutional neural network (CNN) is an artificial intelligence algorithm that is now widely used, mainly in image processing (LeCun, Huang and Bottou, 2004). To date, the CNN algorithm has also been used successfully in many other fields, such as text sensitivity classification (Wang and Mahadevan, 2011), human activity classification (Harel and Mannor, 2011; Kulis, Saenko and Darell, 2011; Duan, Xu and Tsang, 2012) and multilingual text classification (Prettenhofer and Stein, 2010; Li et al., 2020).
The operation of the CNN algorithm can be explained with a handwritten letter recognition example, in which the input is a picture of a handwritten letter. When the CNN algorithm runs, it tries to find the distinctive features of this letter. A filter is required to find the features that distinguish the letter in the input picture from other letters; this is the convolutional layer. In the pooling layer, the resolution of the features found is then reduced while the distinguishing feature is retained. At this stage, independence from orientation is ensured; in other words, the distinguishing feature is found regardless of directions such as right, up or down. In short, filtering is followed by pooling, these processes are repeated, and then come flattening and classification (Sharif et al., 2020).
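The filtering and pooling steps can be sketched in one dimension, which is the shape of a price series, rather than on a two-dimensional letter image. The filter values and the input below are assumptions chosen for illustration; as in most deep learning libraries, the "convolution" is computed without flipping the filter.

```python
def conv1d(signal, kernel):
    """Slide the filter over the signal and return one response per position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(xs):
    """Keep positive responses and set negative ones to zero."""
    return [max(0.0, x) for x in xs]


def max_pool1d(xs, size=2):
    """Reduce resolution by keeping the largest response in each window."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]


# Hypothetical input and a filter that responds to a rise between neighbours.
signal = [1.0, 2.0, 4.0, 3.0, 1.0, 0.0]
rise_filter = [-1.0, 1.0]

features = relu(conv1d(signal, rise_filter))
print(max_pool1d(features))
# [2.0, 0.0]
```

The pooled output says that a rise was detected in the first part of the window and none in the second, without recording the exact position, which is the sense in which pooling reduces resolution while keeping the distinguishing feature.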
In this study, financial instrument estimation is made by using the LSTM and CNN algorithms in a hybrid manner.

Autoregressive Integrated Moving Average
Autoregressive Integrated Moving Average is referred to as ARIMA for short. It is also known as the Box-Jenkins model, as it was first introduced by Box and Jenkins. ARIMA models, which are the best known and most commonly used time series models, assume a linear relationship between the data constituting the time series and can model this linear relationship. They can also be applied successfully to time series that are stationary or have been made stationary by various statistical methods (Kaynar and Taştan, 2009).
Source: Authors' own compilation
An ARIMA model is written as (p, d, q), where p is the autoregressive order, d is the degree of differencing and q is the moving average order. For example, (1,1,0) means that the autoregressive order is 1, the series is differenced once and no moving average term is used. Within the scope of this study, financial instrument estimation is made using the ARIMA model and the results are compared with the results of other models.
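To make the (1,1,0) example concrete, the fragment below differences a series once, fits a single autoregressive coefficient to the differences by least squares, and forecasts the next value. It is a simplified sketch that assumes no intercept or drift term and uses a made-up series; a statistical library would normally estimate the model, and the order used in this study is not stated here.

```python
def difference(series):
    """First difference: d = 1 in ARIMA(p, d, q)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares estimate of phi in diff_t = phi * diff_(t-1) + error (p = 1)."""
    num = sum(prev * cur for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den if den else 0.0


def forecast_arima_110(series):
    """One-step-ahead forecast of a simplified ARIMA(1,1,0) without intercept."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    # Predict the next change, then add it back to the last observed level.
    return series[-1] + phi * diffs[-1]


# Hypothetical series whose daily changes halve each day (8, 4, 2, 1).
series = [0.0, 8.0, 12.0, 14.0, 15.0]
print(forecast_arima_110(series))
# 15.5
```

Here the fitted coefficient is 0.5, so the next change is estimated as half of the last change of 1, giving a forecast of 15.5.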

Ensemble Classification Boosting
Ensemble classification methods aim to get more successful results by running multiple classification algorithms together. The goal is a result that is more successful than that of any of the individual classification algorithms. Ensemble classification methods have been identified as one of the most effective developments in the data mining field in the last decade (Seni and Elder, 2010).
As its name suggests, ensemble classification is a method created by using multiple data mining algorithms together. Combining multiple methods aims to produce a general method with better performance and the most accurate estimate. Bagging, boosting and random forest are examples of ensemble classification. The main purpose of ensemble classification is to combine the results previously obtained using different classification algorithms and to obtain a result better than the best of these results (Kılınç et al., 2015).
For example, assuming that estimations have been collected from 10 experts, ensemble classification strategically combines the estimations of these 10 experts to obtain a more accurate and robust estimation than that of any single expert. As will be discussed later in this section, there are several different methods for creating an ensemble classification. This section explains how ensemble classification works and gives basic information on why it is typically recognized to achieve good generalization performance.
As illustrated in Figure 4, ensemble classification seeks a more accurate estimation by using multiple classification algorithms together. These can be different algorithms, such as decision trees, support vector machines and logistic regression. For example, assuming that algorithm S1 has a 60% estimation rate and algorithm S2 has a 70% estimation rate, the aim of the ensemble classification method is an estimation rate above 70%.
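Why a combination can exceed its best member can be checked with a small calculation. The sketch below computes the accuracy of a simple majority vote, assuming the classifiers make their errors independently of each other; it uses S1 at 60% and S2 at 70% from the example above plus a hypothetical third classifier S3 at 70%, since a majority vote needs an odd number of voters. Simple voting is used here for illustration and is not the boosting method of this study.

```python
from itertools import product


def majority_vote_accuracy(accuracies):
    """Probability that more than half of the classifiers are correct.

    Assumption: the classifiers' errors are statistically independent.
    """
    n = len(accuracies)
    total = 0.0
    # Enumerate every pattern of correct (True) and wrong (False) classifiers.
    for outcome in product([True, False], repeat=n):
        prob = 1.0
        for correct, acc in zip(outcome, accuracies):
            prob *= acc if correct else 1.0 - acc
        if sum(outcome) > n / 2:
            total += prob
    return total


# S1 = 60%, S2 = 70% (from the text), S3 = 70% (hypothetical).
print(round(majority_vote_accuracy([0.6, 0.7, 0.7]), 3))
# 0.742
```

Under the independence assumption the vote reaches about 74%, above the best single rate of 70%. When the classifiers tend to make the same mistakes, the gain is smaller or disappears.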

Boosting
The boosting method checks the results of multiple classification algorithms and obtains a new result from them. The aim is a result that is stronger than the results of the individual classification algorithms. The term "boosting" refers to the family of algorithms that can transform weak classification algorithms into strong algorithms (Zhou, 2012).
In the boosting algorithm, the results of multiple classification algorithms are assessed and a new result is obtained. To obtain this result, the results of the classification algorithms used are weighted. The weights given to these results are determined by the previous success of the classification algorithms and are updated as new results arrive. The boosting method works as an iterative procedure, focusing more on previously misclassified records.
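One well-known member of the boosting family is AdaBoost, and a single round of it shows both ideas described above: the classifier receives a weight based on its success, and the misclassified records receive more attention in the next round. The sketch below follows the standard AdaBoost formulas; it is given as an illustration of the family and is not claimed to be the exact procedure used in this study.

```python
import math


def adaboost_round(sample_weights, is_correct):
    """One AdaBoost round.

    sample_weights: current weights of the records (summing to 1).
    is_correct: for each record, whether the classifier got it right.
    Returns the classifier weight alpha and the updated record weights.
    """
    error = sum(w for w, ok in zip(sample_weights, is_correct) if not ok)
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must be strictly between 0 and 1")
    # A classifier with a lower weighted error gets a larger say.
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified records are scaled up, correctly classified ones down.
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(sample_weights, is_correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Five equally weighted records, the last one misclassified.
alpha, weights = adaboost_round([0.2] * 5, [True, True, True, True, False])
print(round(alpha, 3), [round(w, 3) for w in weights])
# 0.693 [0.125, 0.125, 0.125, 0.125, 0.5]
```

After the update the single misclassified record carries half of the total weight, so the next classifier is pushed to get it right.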
Here is an example of the boosting method. Let us assume that a patient has certain symptoms. Multiple doctors are consulted, and a certain weight is given to the assessment of each doctor. These weights are assigned based on the accuracy of the doctors' previous diagnoses. The final diagnosis is the combination of the individual diagnoses according to these weights (Han, Kamber and Pei, 2012). Within the scope of this study, the algorithm results are observed to be more successful using the boosting method.

III. Findings
An application has been developed to conduct the studies and test the results. The application is written in Java, and Netbeans is used for application development and database creation. The libraries and tools used to develop the application are Deeplearning4j, Java, Maven, Primefaces, Spring, Apache Tomcat, Derby Client, Netbeans, Apache Poi, Hibernate, WekaDeeplearning4j and Timeseries-forecast.
In order to apply the proposed solution, some of the data mining algorithms should be run and their results observed. The Boost Functional Clustering Algorithm flowchart in Figure 5 can be explained as follows. First, the trend is determined. The success of each algorithm is then determined for the current trend, and the algorithms are weighted in proportion to their success rates. The estimation results are summed using these weights, and the result with the largest total is taken as the result of the boost functional clustering algorithm.

Figure 5. Boost Functional Clustering Algorithm Flowchart
Source: Authors' own compilation
The success rates of the algorithms in the horizontal trend are checked. In total, 98 days of the 2019 USD/TRY exchange rate data are in the horizontal trend. Within the horizontal trend, LSTM has 66 days of successful estimation and 32 days of unsuccessful estimation. Table 2 provides the estimation results of the algorithms according to the horizontal trend. The order based on these results is LSTM, CNN and ARIMA, and the weight points can be assigned in this order.
Table 3 contains the weighting results of the one-day boost functional clustering algorithm for 28.01.2020. The result is obtained by summing the weights within each group: 2.1 for Buying is compared with 0.8 for Selling. Since 2.1 is greater, the result of the boost functional clustering algorithm is Buying.
To test the application, the data of a financial instrument must first be uploaded into the system, after which the data become estimable. This section contains the test results of the USD/TRY and ounce gold data on the application. The test results of the application can be compared across different financial instruments and time ranges.
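The weighting and summing steps of the boost functional clustering algorithm described in this section can be sketched as a weighted vote. In the fragment below, the LSTM counts (66 successful and 32 unsuccessful days in the horizontal trend) come from the text; the counts for CNN and ARIMA, the daily predictions and the use of the raw success rate as the weight are assumptions for illustration and do not reproduce Table 2 or Table 3.

```python
def success_rate(successes, failures):
    """Share of successful estimations within the current trend."""
    return successes / (successes + failures)


def weighted_vote(predictions, weights):
    """Sum the weights per predicted label and return the label with the largest total."""
    totals = {}
    for name, label in predictions.items():
        totals[label] = totals.get(label, 0.0) + weights[name]
    winner = max(totals, key=totals.get)
    return winner, totals


weights = {
    "LSTM": success_rate(66, 32),   # counts reported for the horizontal trend
    "CNN": success_rate(60, 38),    # hypothetical counts
    "ARIMA": success_rate(55, 43),  # hypothetical counts
}
# Hypothetical predictions of the three algorithms for one day.
predictions = {"LSTM": "Buying", "CNN": "Selling", "ARIMA": "Buying"}

winner, totals = weighted_vote(predictions, weights)
print(winner, {label: round(total, 3) for label, total in totals.items()})
```

With these illustrative numbers the Buying group has the larger total, so the combined result is Buying, in the same way that 2.1 for Buying outweighs 0.8 for Selling in the example of Table 3.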
The movements of a medium-term security are analyzed with the 1-year USD/TRY exchange rate. Table 4 contains the 2019 USD/TRY exchange rate application test results. The proposed solution, the Boost FK algorithm, is observed to provide more successful results.

Source: Authors' own compilation
The movements of a long-term security are analyzed with the 17-year USD/TRY exchange rate. Table 5 shows that running the 17-year (2003-2020) USD/TRY exchange rate on the application gives different results from the 1-year data. A total of 4293 days from this time range were analyzed. The success rate decreased compared to the one-year results.

Source: Authors' own compilation
The movements of a medium-term security are analyzed with the 1-year ounce gold data. Table 6 contains the 250-day ounce gold data test results. Unlike the USD/TRY exchange rate, ounce gold has less volatility.

Source: Authors' own compilation
Within the scope of the study, a network consisting of RNN and LSTM was created. In addition, a hybrid model was created from CNN and LSTM, the ARIMA algorithm was run alongside them, and the results obtained were combined with the boosting method, which is one of the ensemble classification methods. In this way, unlike previous studies in the literature, a more accurate estimation result was obtained by using multiple algorithms simultaneously and presenting their results collectively. The financial instruments estimated were USD/TRY and ounce gold, since they are the most traded financial instruments in Turkey. The outputs provide the literature with a more inclusive perspective on financial instrument estimation and are expected to create added value for financial literacy. Alternative investment horizons and instruments were covered by testing different time ranges and different financial instruments.

IV. Conclusion
This study addresses increasing financial literacy and how financial instrument estimation can be made more successful. Within this context, studies were carried out on USD/TRY and ounce gold data using the LSTM, CNN and ARIMA algorithms. During these studies, a network consisting of an LSTM layer and an RNN output layer was created. In addition, the LSTM and CNN algorithms were used in a hybrid manner.
A solution was proposed to improve the estimation results obtained with these algorithms: applying the ensemble classification boosting method to the algorithm results. An application was developed to apply this solution, observe the results and compare them with the results of the other algorithms.
Finally, using the developed application, the proposed solution and the algorithms were tested with different time ranges and training data. Based on the test results, it was observed that the proposed solution, the Boost FK algorithm, is more successful, with 67% in the ounce gold tests, 66% in the 1-year USD/TRY exchange rate tests and 64% in the 17-year USD/TRY exchange rate tests. It has been concluded that the ounce gold test results are more estimable than those of the USD/TRY exchange rate. Applying the boosting method based on the performance evaluation results of the artificial intelligence algorithms made the results more successful.

Emerging Markets Journal | Volume 11 No 2 (2021) | ISSN 2158-8708 (online) | DOI 10.5195/emaj.2021.229 | http://emaj.pitt.edu