CHAPTER 1
INTRODUCTION
Electrical power systems are of immense importance to modern society. They are the backbone of all infrastructure and electrical applications, and it is hard to imagine present-day society without them. An electrical power system consists of three main stages: the generation system, where power is generated by different methods; the transmission system, which carries the generated power from the generating end to the distribution end; and the distribution system, which supplies the power demanded by consumers. Storing large amounts of electricity for later use is still an unsolved challenge, so the difference between production and consumption must be kept small at all times. If this balance is disturbed, blackouts may occur, which can be damaging to the grid and very costly to repair and restore. Hence, to maintain the reliability of the grid, the generation-load difference must always be kept in check, even in the face of large dynamic demand.

The generation, transmission, and distribution systems therefore need load forecasts so that the balance between generation and load can be maintained, which adds to the efficiency, security, and economy of the electrical infrastructure. At the generating points, load forecasting techniques are used to schedule generation so that the present and future load demand is supplied efficiently. The transmission and distribution systems use load forecasting techniques to optimize the power flow in their networks and to reduce contingencies and overloads. Load forecasting plays a crucial role in the energy management of the power system. Accurate load forecasting helps the utility to make unit commitment decisions and schedule device maintenance, plays an important role in minimizing generation cost, and is essential for the reliability of the power system. System operators use the forecast results as a basis for determining whether the system is vulnerable. Sometimes the analysis calls for corrective actions such as load shedding or power purchase; the forecast can inform bilateral energy purchases in the day-ahead market and asset commitment, and can help to reduce the peak load demand. Load forecasting also plays a crucial role in power system planning, since accurate forecasts for infrastructure expansion and maintenance can save a significant amount of capital.

1.1 CLASSIFICATION OF LOAD FORECASTS
Based on the input information and the forecast horizon, electrical load forecasts are classified as Very Short-Term Load Forecasting (VSTLF), Short-Term Load Forecasting (STLF), Medium-Term Load Forecasting (MTLF) and Long-Term Load Forecasting (LTLF).
Very short-term and short-term load forecasting cover horizons ranging from minutes to weeks. They are essential for ensuring the stability of the system, for scheduling and control of the power system, for security analysis, and as input to contingency analysis.
Medium-term load forecasting covers horizons ranging from weeks to months ahead and is required for scheduling maintenance.
Long-term load forecasting covers horizons ranging from months to years and is used to determine the capacity of generation, transmission and distribution expansion, along with the annual maintenance schedule.
Load forecasting methods can also be classified by the degree of mathematical analysis used in the forecasting model. On this basis there are two basic types: quantitative and qualitative methods. Qualitative forecasting methods are generally used by planners to produce forecasts; they include curve fitting and technological comparisons, among other methods.
The load forecasting techniques may be classified into three major groups:
Traditional Forecasting Techniques
Modified Traditional Techniques
Soft Computing Techniques

1. Traditional Forecasting:
Load forecasting is one of the most important tools for predicting future load demand, which is needed for planning infrastructure, following development trends, and indexing the overall development of a country. In the early days, these forecasts were carried out using traditional (conventional) mathematical techniques. With the development of advanced tools, these techniques have been augmented with the findings of researchers to forecast more effectively in various fields of study.
The traditional techniques are classified as follows: regression, multiple regression, and exponential smoothing.

a. The regression method is one of the most widely used statistical techniques and is easy to implement. Regression methods are usually employed to model the relationship between load consumption and other factors such as weather conditions, day types and customer classes. The method assumes that the load can be divided into a standard load trend and a component that depends linearly on the factors influencing the load.
b. Multiple regression is the most popular method and is often used to forecast load affected by a number of factors, such as meteorological effects, per capita growth, electricity prices, and economic growth. The model parameters in multiple regression analysis for load forecasting are obtained by least-squares estimation. A least-squares approach has also been used to identify and quantify the different types of load at power lines and substations.
c. Exponential smoothing is another approach used for load forecasting. In this method, the load is first modeled based on previous data, and this model is then used to predict the future load [1].
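
As an illustration of the exponential smoothing idea in item (c), the following is a minimal sketch of simple (single) exponential smoothing. It is written in Python purely for illustration; the thesis itself works in R. The function name, the smoothing factor `alpha` and the load values are assumptions made for this example and are not taken from the thesis.

```python
def simple_exponential_smoothing(loads, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * load_t + (1 - alpha) * level_{t-1}
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    level = loads[0]              # initialise the level with the first observation
    levels = [level]
    for load in loads[1:]:
        # Recent observations get weight alpha, the old level gets 1 - alpha.
        level = alpha * load + (1.0 - alpha) * level
        levels.append(level)
    return levels


# Hypothetical hourly load values (MW), for illustration only.
hourly_load_mw = [100.0, 110.0, 105.0, 120.0]
levels = simple_exponential_smoothing(hourly_load_mw, alpha=0.5)
next_hour_forecast = levels[-1]   # the last level is the forecast for the next step
```

A larger `alpha` makes the forecast follow recent load changes more closely, while a smaller value gives a smoother forecast that reacts more slowly.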

2. Modified Traditional Techniques:
The traditional forecasting techniques have been modified so that the parameters of the forecasting model are corrected automatically under varying environmental conditions. Some of the modified versions of the traditional techniques are adaptive load forecasting, time series methods, and support vector machine-based techniques.
a. Adaptive demand forecasting automatically corrects the model parameters to keep track of changing load conditions. The state vector is estimated using the current prediction error and the current weather data acquisition programs, and is determined by analysis of the total historical data set. This approach has distinctive features: cyclic patterns are handled by autocorrelation optimization, and, in addition to the model parameters being updated, the order and structure of the time series model adapt to new conditions.

b. Time series methods are based on the assumption that the data have an internal structure, such as autocorrelation, trend, or seasonal variation. Time series have been used in the fields of economics, digital signal processing, and electric load forecasting. In particular, ARMA (autoregressive moving average), ARIMA (autoregressive integrated moving average), ARMAX (autoregressive moving average with exogenous variables), and ARIMAX (autoregressive integrated moving average with exogenous variables) are the most often used classical time series methods.

c. The Support Vector Machine (SVM) is a powerful machine learning method based on statistical learning theory (SLT). It is used for classification and regression analysis: it analyses data and recognizes patterns. Temperature and other climate information are reported to be of limited use for mid-term load forecasting, and introducing time series forecasting may improve the results.

3. Soft Computing Techniques:
Soft computing has emerged to deal with such models effectively and has been in wide use over the last few decades. It is an emerging approach that parallels the remarkable ability of the human mind to learn and reason in an environment of uncertainty and imprecision. It is developing into a tool that helps computer-based intelligent systems mimic the ability of the human mind to employ modes of reasoning that are approximate rather than exact. Soft computing is a collection of disciplines that includes fuzzy logic (FL), neural networks (NNs), evolutionary algorithms (EAs), etc.
a. Fuzzy logic is a generalization of the usual Boolean logic used for digital circuit design. Under Boolean logic an input takes a truth value of “0” or “1”, whereas under fuzzy logic an input is associated with a qualitative range. For example, a transformer load may be “low”, “medium” or “high”. Fuzzy logic enables one to logically deduce outputs from fuzzy inputs; in this sense, it is one of a number of techniques for mapping inputs to outputs. After the logical processing of the fuzzy inputs, a “defuzzification” process can be used to produce precise outputs.
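
The transformer-load example above can be sketched in a few lines of Python (used here only for illustration). The triangular membership functions, their breakpoints and the representative output values are all hypothetical choices made for this sketch; the thesis does not specify them.

```python
def triangular(x, left, peak, right):
    """Degree of membership (0..1) of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify_load(load_pct):
    """Map a transformer load (percent of rating) to fuzzy memberships."""
    load_pct = min(max(load_pct, 0.0), 100.0)   # clamp to the 0..100 % range
    return {
        # The outer breakpoints (-1 and 101) only keep the end sets at 1.0
        # for loads of exactly 0 % and 100 %.
        "low": triangular(load_pct, -1.0, 0.0, 50.0),
        "medium": triangular(load_pct, 0.0, 50.0, 100.0),
        "high": triangular(load_pct, 50.0, 100.0, 101.0),
    }


def defuzzify(memberships, representative_values):
    """Weighted average of one representative value per fuzzy set."""
    total = sum(memberships.values())
    if total == 0.0:
        raise ValueError("no fuzzy set is active")
    weighted = sum(memberships[name] * representative_values[name]
                   for name in memberships)
    return weighted / total


memberships = fuzzify_load(75.0)
# Hypothetical crisp output associated with each fuzzy set.
crisp_output = defuzzify(memberships, {"low": 20.0, "medium": 50.0, "high": 90.0})
```

A load of 75 % is half "medium" and half "high", so the crisp output lies midway between the representative values of those two sets.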

b. Neural networks (NNs), or artificial neural networks (ANNs), have very wide applications because of their ability to learn. Neural networks offer the potential to remove the reliance on an explicit functional form of the forecasting model. There are many types of neural networks, such as the multilayer perceptron network and the self-organizing network. A network may contain multiple hidden layers, each with many neurons. ANN-based forecasting trained with back-propagation has been shown to be superior to traditional methods. A recurrent NN uses the intrinsic non-linear dynamic nature of the network to represent the load as the output of a dynamic system affected by weather, time and environmental variables [2].
1.2 FACTORS CONSIDERED FOR LOAD FORECASTING
Load forecasting is based on several different factors that influence the load, such as social, economic, environmental and time factors. Hence, no single forecasting method can be used for all types of utilities. The forecast also depends on the type of electrical load, as there are different types of consumers with different load consumption patterns. Consumption may increase when socio-economic factors such as big events, sports competitions or religious occasions place an extra load on the system. In addition, a persistent factor that influences forecasting is that the load curves at the consumer’s end on weekdays differ from those on weekends. During holidays a different pattern is seen, irrespective of whether the holiday falls on a weekday or a weekend; this is also a crucial factor to consider when forecasting the load, in order to predict future demand more reliably and accurately.

Weather is another important parameter. It refers to the condition of meteorological elements such as temperature, humidity, wind and rainfall, and how they vary in a region over a given period of time. Climate refers to the average weather conditions over a defined period of time in a definite area. Among all the meteorological elements, temperature and humidity have the greatest impact on energy consumption. Including temperature and humidity can increase the accuracy of the forecasted load values, as they affect the generation and transmission of electricity to a great extent.

Many important factors are considered for improving load forecasting techniques:
Meteorological Factors: Weather, climate, temperature, humidity, solar radiation, etc.
Calendar Factors: Hour of day, day of week, time of year, etc.
Economic Factors: Industrial development, electricity price.
Random Factors: Sports activities, festivals, etc.
Customer Factors: Type of consumption, size of the building, electric appliances, number of employees, etc.

1. Meteorological Factors:
The peak and minimum temperature, rainfall, wind speed, cloud cover, humidity, and snowfall are factors that affect load forecasting. Most of the time these factors are correlated with one another; temperature, for example, is correlated with humidity, rainfall and cloud cover. Some of the important meteorological factors are:
A. Climate: Climate is the average weather over a definite period in a definite area. It may differ between areas, for example between high and low altitudes or between locations near to and far from the sea. Climate also varies with time: seasonally, annually or over decades. Climate is one of the important factors for long-term load forecasting.

B. Weather factors: Weather is the state of the atmosphere over a short period in a specific area. Predicting the weather is complicated because it varies within a short time. Short-term load forecasting treats weather as one of its important factors. Variations in weather affect the comfort level of customers and hence their use of heaters, coolers, and geysers. The weather factors include temperature, wind speed, humidity, and cloud cover.

1. Temperature: Temperature is the most important weather factor. It is a measure of how hot or cold the atmospheric air is in a definite area at a definite time. Cloud cover and wind speed are also correlated with the load. When the temperature increases, the energy consumed by coolers increases, and when the temperature decreases, the use of heaters increases. Temperature is therefore correlated with the load curve.

2. Wind speed: Wind speed is the speed of moving air. Wind speed also affects load consumption. A windy summer day feels comfortable, so fewer cooling appliances are used and load consumption is lower. On a cold windy day the temperature falls and heating appliances are required, which increases load consumption.

3. Humidity: Humidity is the presence of water vapor in the air. Humidity affects short-term load forecasting. During the summer and the rainy season, humidity makes the temperature feel higher, so people use more cooling appliances. Hence load consumption increases on humid summer days.

4. Cloud cover: Cloud cover is a mass of cloud over all or most of the sky, which obstructs air flow. The timing of electricity usage depends on cloud cover. At night the temperature is already low in the absence of sunlight, but because clouds reduce air circulation, cloud cover at night may raise the temperature for that night and the next day. Hence load consumption during the next day will be high. During the daytime, cloud cover may block the sunlight, which lowers the atmospheric temperature and hence the electricity consumption [3].

2. Calendar Factors:
Calendar factors capture the influence of calendar variation in the same month or quarter between different years. In a daily load pattern, the periods of higher load consumption occur at definite times. Load consumption differs between seasons, between weekends and weekdays, and in the periods leading up to festivals such as Diwali and New Year. Generally, calendar factors are classified as:
A. Working days: There are significant variations in electricity usage between working days and holidays, and the load pattern differs from day to day. For example, load consumption from Tuesday to Thursday is similar, but Mondays and Fridays show sudden variations because they are adjacent to the weekend.
B. Moving holiday effects: Moving holidays such as Holi, Diwali, Dussehra, Eid and Raksha Bandhan in India may fall in different weeks or months from year to year. They affect the load forecast because industrial activity is at a minimum on these holidays.

C. Time factor: The electric load varies with time depending on human and economic activity. The load is higher in the daytime and lower at night. The time factor can be the season of the year, weekday or weekend, or the hour of the day. It is very difficult to forecast the load on holidays, as their dates are not fixed within a month of the year. The daily load pattern clearly reflects the daily activities of human life, such as working hours, leisure hours and resting hours. There are specific patterns of load variation with time: because of reduced activity in industries and offices, the weekend or holiday load is lower than the weekday load. This cyclic time dependency leads to the load being analyzed on an hour-of-day, day-of-week and time-of-year basis [4].
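
The calendar factors described above can be turned into model inputs in a straightforward way. The sketch below, in Python for illustration only, derives hour-of-day, day-of-week and time-of-year features from a timestamp. The function name, the feature names and the holiday calendar are hypothetical and are not part of the thesis.

```python
from datetime import date, datetime


def calendar_features(timestamp, holidays=frozenset()):
    """Derive simple calendar features from one load-reading timestamp."""
    day_of_week = timestamp.weekday()        # Monday = 0 ... Sunday = 6
    return {
        "hour_of_day": timestamp.hour,
        "day_of_week": day_of_week,
        "month_of_year": timestamp.month,
        "is_weekend": day_of_week >= 5,
        # Moving holidays must be supplied from an external calendar,
        # because their dates change from year to year.
        "is_holiday": timestamp.date() in holidays,
    }


# Hypothetical holiday calendar containing a single date, for illustration.
example_holidays = frozenset({date(2018, 11, 7)})
features = calendar_features(datetime(2018, 11, 7, 18, 30), example_holidays)
```

Keeping the holiday flag separate from the weekend flag reflects the observation above that holidays show their own load pattern whether they fall on a weekday or a weekend.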

1.3 MOTIVATION
The power system is of enormous importance to present-day society. It is the backbone of all infrastructure and of the electrical appliances in operation today. The three major parts of the electrical power system are:
The electrical power plant, where the electricity is generated.

The transmission and distribution system, which carries electrical energy through the electrical network.

The end user, who consumes the electrical energy supplied by the electrical network.

At present it is difficult to store large amounts of electricity, so it is necessary to maintain equilibrium between generation and consumption in the system. Technology is evolving to store the significant data that have an impact on the power system. Smart meters provide two-way communication between the utility and consumers, through which the load consumption is recorded. The load also depends on different factors that influence consumption patterns, and weather is one of the most important of them. Load forecasting predicts the future load from the historical data obtained from the smart meters and the weather parameters recorded at the corresponding time intervals. It is used for real-time operation and control, peak demand management, etc., and thus supports the operation of the electrical power system.
1.4 OBJECTIVE
The main objective of the project is to develop a high-accuracy load forecasting technique based on historical load data and weather data, i.e., ambient temperature as an independent variable, using a time series load forecasting technique.
Short-term load forecasting is considered: data sets covering one day to one week are used for the analysis.

A time series forecasting algorithm is developed in R to forecast the load values.

The results of the time series forecasting are analysed to determine the effect of the number of load samples considered and the influence of weather on the load forecast, which helps to improve the forecasting.

1.5 METHODOLOGY
1. The electrical load data logged by the meters every minute and the weather parameter, i.e., the temperature value for the corresponding time interval, are used as the input data for time series load forecasting in R.

2. The input data is divided into two parts: one-day load data, with and without weather data, and one-week load data, with and without weather data.

3. Each data sample is divided into two sets: the training data is applied to the time series forecasting algorithm to obtain the forecast, and the test data is compared with the forecasted data to analyse the accuracy.

4. The R software, which supports statistical analysis, data analysis and graphical representation, is used to develop a platform for time series forecasting.
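
Step 3 of the methodology can be sketched as follows. The example is in Python for illustration only (the thesis implements its algorithm in R) and makes several assumptions: a chronological 75/25 split, a naive last-value forecast standing in for the ARIMA forecast, and MAE and MAPE as the accuracy measures. All load values are hypothetical.

```python
def split_train_test(series, train_fraction):
    """Split a series chronologically: the first part trains, the rest tests."""
    split_index = int(len(series) * train_fraction)
    return series[:split_index], series[split_index:]


def mean_absolute_error(actual, forecast):
    """MAE: average absolute difference, in the units of the load."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_percentage_error(actual, forecast):
    """MAPE in percent; assumes that no actual value is zero."""
    return 100.0 * sum(abs((a - f) / a)
                       for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical load readings (kW), for illustration only.
load_kw = [90.0, 95.0, 105.0, 110.0, 95.0, 100.0, 100.0, 125.0]
train, test = split_train_test(load_kw, train_fraction=0.75)

# Placeholder forecast: repeat the last training value. In the thesis this
# role is played by the time series (ARIMA) forecast.
naive_forecast = [train[-1]] * len(test)

mae = mean_absolute_error(test, naive_forecast)
mape = mean_absolute_percentage_error(test, naive_forecast)
```

The split is chronological rather than random because the test data must lie in the future relative to the training data, otherwise the accuracy figures would not reflect a real forecasting situation.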
1.6 ORGANIZATION OF THESIS
Chapter 1 introduces the project. It explains the importance of load forecasting in the electrical power system, the classification of load forecasts, the external factors that influence load forecasting, the motivation for selecting this project, and its objective and methodology.

Chapter 2 is the literature survey, carried out to gain an overview of load forecasting, including its classification, the different techniques, and the factors that affect it. In addition, it reviews the literature on short-term load forecasting and on the consideration of an external factor, i.e., the weather parameter, in the load forecast. The chapter also gives a brief introduction to the R software used.

The load forecasting technique considered in this thesis is time series forecasting, and within it the ARIMA model. Chapter 3 therefore gives a brief explanation of time series load forecasting and its different models. It then describes the time series forecasting algorithm developed in R for the input parameters considered.

Chapter 4 analyses and discusses the results obtained from the time series ARIMA algorithm developed in R, i.e., the forecast plots and the errors.
Chapter 5 concludes the work carried out in this project and outlines future work that can improve the algorithm developed.

CHAPTER 2
LITERATURE SURVEY
2.1 OVERVIEW
[Block diagram. Blocks: Weather data, Weather forecast, Load data, Modeling Process, Load Forecast]
Fig 2.1: A typical short-term load forecasting process
Figure 2.1 shows a typical short-term load forecasting process. The forecasted weather and the load data are taken as inputs to the modeling process, and the load forecast is obtained as its output. The modeling process captures the variations in the inputs and so plays a crucial role in forecast accuracy. A large body of research in the field of STLF has been devoted to improving the modeling process.
Power engineers have developed various STLF techniques to tackle the time series forecasting problem. These techniques can be classified into statistical approaches, such as regression analysis [5] and time series analysis [6], and artificial intelligence based approaches, such as ANN [7], fuzzy logic [8], and Support Vector Machine (SVM) [9]. STLF has also been studied using various combinations of these techniques. The effect of various parameters on electric load consumption has been investigated. At first, temperature and humidity were considered [10]; in a more advanced version, the effects of wind speed and humidity were accounted for through a linear transformation of temperature [11]. The electric load is normally driven by human activities and by nature. The effects of human activities are normally represented by calendar variables, and the effects of nature by weather variables such as temperature, humidity and wind speed.

Weather forecasting is an important process that plays a crucial role in STLF accuracy, and much research is being carried out on improving, developing and incorporating weather parameters into the forecast [12]. An STLF method combined with a temperature forecast has been proposed [13]. A front-end weather forecast has also been incorporated to improve STLF performance [14].
2.2 REVIEW OF PAPERS
2.2.1 Conceptual Reviews
The literature on load forecasting can be traced back to 1918 [15,16]. In the early 1980s, two load forecasting bibliographies were published by the IEEE load forecasting working group [17,18]. Over the past 40 years, STLF has been developed from various perspectives.
Abu-El-Magd and Sinha reviewed many online and offline methods, including multiple regression, exponential smoothing, the stochastic time series approach and the state-space approach [19]. The review focused on the identification part of STLF and discussed the advantages and disadvantages of each approach. The authors stated that the state-space and time series approaches were used extensively for STLF. According to the paper, the time series approach was easy to apply and understand but difficult to update online.

The review paper by Gross and Galiana covers STLF as a whole rather than focusing on techniques [20]. It discusses forecasts of hourly system load with horizons of one hour to one week, and the role of STLF in the online scheduling and security functions of an energy management system. The authors observed that the similar-day method was being replaced by the time series and state-space approaches.

2.2.2 Experimental Reviews
Five techniques are explained in [21]: multiple linear regression, exponential smoothing, time series, the state-space method and a knowledge-based approach. The comparative analysis of these five techniques helps in understanding their inherent level of difficulty and the expected outcome. The five techniques were used to generate hourly load forecasts based on data from a utility in the southern US, and the authors compared and analyzed the results of each technique.

Taylor and McSharry compared five models in order to pick the best one [22]: the Auto-Regressive Integrated Moving Average (ARIMA) model, the AR model, double seasonal Holt-Winters exponential smoothing, an alternative exponential smoothing method, and a principal component analysis based method. The forecasts were made at hourly intervals for a horizon of 24 hours, based on data from 10 European countries. The results were analysed using the Mean Absolute Percentage Error (MAPE) and the Mean Absolute Error (MAE). The authors conclude that more accuracy can be obtained for long lead times when weather parameters are considered.

2.2.3 Weather Variables
It is known that weather has a strong effect on load, especially where electric air conditioners are heavily used. The literature uses various weather variables in different ways. Some studies use weather variables such as temperature, dew point temperature, wind direction, wind chill index (WCI) and heating/cooling degree days, while others use temperature, humidity, cloud cover and wind speed [23,24].

Temperature is the most widely used of the weather variables listed above. It is considered in many forms, such as the current-hour temperature, the previous-hour temperature, the difference between the present and previous hour temperatures, or the minimum, maximum or average temperature over a few hours. The relationship between load and temperature can be modeled in different ways: Hagan and Behr used a third-order polynomial [6], and Fan et al. proposed a piecewise linear relationship [25].
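
To make the two functional forms concrete, the following Python sketch (for illustration only) shows a generic third-order polynomial and a generic piecewise linear load-temperature relationship. These are textbook shapes, not the fitted models of the cited papers; every coefficient, threshold and name below is hypothetical.

```python
def cubic_load(temp_c, coeffs):
    """Third-order polynomial: b0 + b1*T + b2*T^2 + b3*T^3."""
    b0, b1, b2, b3 = coeffs
    return b0 + b1 * temp_c + b2 * temp_c ** 2 + b3 * temp_c ** 3


def piecewise_linear_load(temp_c, base_load, heating_slope, cooling_slope,
                          heat_below, cool_above):
    """Flat base load inside a comfort band, rising linearly outside it."""
    if temp_c < heat_below:
        # Colder than the band: heating demand grows as temperature drops.
        return base_load + heating_slope * (heat_below - temp_c)
    if temp_c > cool_above:
        # Warmer than the band: cooling demand grows as temperature rises.
        return base_load + cooling_slope * (temp_c - cool_above)
    return base_load


# Hypothetical parameters: 100 MW base load, comfort band of 15 to 25 deg C.
hot_day = piecewise_linear_load(30.0, 100.0, 2.0, 3.0, 15.0, 25.0)
mild_day = piecewise_linear_load(20.0, 100.0, 2.0, 3.0, 15.0, 25.0)
cold_day = piecewise_linear_load(10.0, 100.0, 2.0, 3.0, 15.0, 25.0)
polynomial_value = cubic_load(2.0, (1.0, 2.0, 3.0, 4.0))
```

Both forms capture the non-linear shape in which load rises at low and at high temperatures; the piecewise form makes the comfort band explicit, while the polynomial is a single smooth curve.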

The weather information used depends on the meteorological conditions of the area concerned, the availability of historical and forecast values, and the time of year. In a study carried out in South Korea [26], the change in temperature during spring, fall and winter was found to be small, so the effect of temperature on load consumption in those seasons is small. In summer, load consumption changes dramatically because of the use of air conditioners.

2.2.4 Time Series Analysis
Regression models were combined with the ARIMA model for STLF [27]. The ARIMA model was used to produce a forecast of the weather-normalized load, whereas the regression models were used to forecast the peak, trough and weather-normalized load or to remove the weather-sensitive trend from the load series.

The ARIMA model, along with other Box-Jenkins time series models, was applied to short-term load forecasting by Hagan and Behr [6]. In this paper, a third-order polynomial of the temperature is used to capture the non-linear relationship between temperature and load. Time series models, namely the ARIMA model, the transfer function model and the transfer function model with a non-linear transformation, were compared over three periods (summer, spring and winter) in 1984. The results showed that all of the time series methods considered performed well. Amjady incorporated the knowledge of experienced human operators into the ARIMA model [28]. The modified model produces an initial forecast and combines it with the load and temperature data in multiple regression models to produce the final forecast. The author considered eight modified ARIMA models to forecast the hourly load for four types of days in cold and hot conditions, and to forecast the peak load demand. Three years of data from the dispatch center of Iran were used for the analysis: two years for training and one year for testing. The proposed method was compared with ARIMA and ANN models, and the modified ARIMA model gave better STLF accuracy than the other approaches.

A modified ARMA approach that takes non-Gaussian processes into account was presented in [29]. An adaptive ARMA model was tested against the traditional Box-Jenkins approach and showed better accuracy [30].

2.3 TOOL USED
The project is carried out with the help of R, which is open-source software. R is well suited to the task because it can hold a large amount of data and provides the tools to analyse it properly. R is both a language and an environment for statistical analysis and graphical representation. Many different statistical methods, such as regression modeling and time series analysis, can be developed in R. R can also render mathematical symbols and formulae in plots, which helps to produce high-quality plots.

The main features of R include effective and efficient data storage and handling, a suitable set of analysis tools, well-plotted graphical output, and a programming language with user-defined functions and simple commands. R runs on all major platforms, i.e., Windows, macOS and Linux machines.

CHAPTER 3
TIME SERIES FORECASTING
Load forecasting is the process of predicting future load values. It supports a number of activities, such as planning how to keep the load side and the generation side in balance. This project studies how an independent variable helps to increase the accuracy of forecasting. It makes use of time series forecasting, with the Auto-Regressive Integrated Moving Average (ARIMA) model used for load forecasting.
3.1 INTRODUCTION TO TIME SERIES FORECASTING
A series of data points indexed in time order is called a time series. Most commonly, a time series is a sequence taken at equally spaced points in time. Time series are used in signal processing, pattern recognition, forecasting, control engineering, earthquake prediction, and other fields of engineering and applied science. Time series analysis comprises methods for analysing time series data to obtain significant statistics and characteristics of the data. Time series forecasting uses a model to forecast future values based on previously observed values.
The first and foremost step in time series analysis is to check whether the data is stationary or non-stationary. If the data is non-stationary, it is transformed into stationary data. There are many methods for obtaining an accurate forecast from stationary data.
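
One common transformation for making a series stationary is differencing, which is also the "integrated" step of the ARIMA model. The minimal Python sketch below is for illustration only; the thesis does not state which transformation its R implementation applies, and the function name and data are assumptions.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# A hypothetical series with a steady upward trend is non-stationary
# because its mean changes over time.
trending_load = [10, 12, 14, 16, 18]
first_difference = difference(trending_load)        # removes the linear trend

# A larger lag removes a repeating pattern of that period, for example
# lag=24 for a daily cycle in hourly data (lag=2 here to keep it short).
seasonal_difference = difference([1, 5, 2, 6, 3, 7], lag=2)
```

After differencing, the trending series becomes constant, so its mean no longer depends on time. The differenced series is shorter than the original by `lag` values.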
The most widely used time series forecasting models are ARIMA models and exponential smoothing. ARIMA models describe the autocorrelations in the data, whereas exponential smoothing describes the trend and seasonality in the data. This project uses the ARIMA model for the forecast and the accuracy analysis. To understand the ARIMA model, it is necessary to understand the Auto-Regressive (AR), Moving Average (MA) and Auto-Regressive Moving Average (ARMA) models.

3.1.1 Auto-Regressive (AR) model
The Auto-Regressive model states that the output value depends linearly on its own previous values and on a stochastic term. In this project, the electrical load of a day or a week, along with temperature, holidays, etc., provides the predictor values, and the future load is the variable of interest to be forecasted.

The Auto-Regressive model forecasts the variable using a linear combination of its past values. Auto-regression is a regression of the variable on itself. AR(p) denotes the Auto-Regressive model of order p, which is defined by
$$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t \tag{3.1}$$
where $\varphi_1, \ldots, \varphi_p$ are the parameters of the model, $c$ is a constant and $\varepsilon_t$ is white noise.
The advantage of the Auto-Regressive model is its flexibility in handling time series patterns. In addition, the present value in the series is correlated with all previous values, i.e., the model has infinitely many non-zero autocorrelation coefficients.
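
Equation (3.1) can be evaluated directly once the parameters are known. The Python sketch below (for illustration only) computes a one-step-ahead AR(p) forecast by setting the unknown future noise term to its mean of zero. The function name and all numerical values are hypothetical.

```python
def ar_forecast(history, c, phi):
    """One-step-ahead AR(p) forecast: c + sum_i phi_i * X_{t-i}.

    The white noise term is replaced by its expected value, zero.
    """
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # Reverse the last p values so that recent[0] is X_{t-1}, matching phi[0].
    recent = history[-p:][::-1]
    return c + sum(coef * value for coef, value in zip(phi, recent))


# Hypothetical AR(2) model and load history, for illustration only.
forecast = ar_forecast(history=[100.0, 104.0, 108.0], c=10.0, phi=[0.5, 0.25])
```

Here the forecast is 10 + 0.5 × 108 + 0.25 × 104 = 90. In practice the parameters are estimated from the training data rather than chosen by hand.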

3.1.2 Moving Average (MA) model
The Moving Average (MA) model is a common approach for modeling univariate time series. In this model, the output variable depends linearly on the present value and several past values of a stochastic term. The Moving Average model of order q is denoted MA(q) and is given by,

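$$X_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} \tag{3.2}$$
Where $\mu$ is the mean of the series, $\theta_1, \ldots, \theta_q$ are the parameters of the model and $\varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q}$ are white noise error terms. The parameter q is the order of the MA model.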

The error terms are assumed to form a white noise process, i.e., a sequence of independent, identically distributed random variables with zero mean and constant variance, and they are usually assumed to follow a normal distribution. A Moving Average model can be viewed as a regression of the present observation of the time series against past forecast errors. Because these errors are not directly observed, fitting a Moving Average model is more complicated than fitting an Auto-Regressive model.
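
A small sketch of the MA(q) definition, written in Python for illustration only (the project uses R). The function name `ma_series`, the noise values, and the parameter 0.5 are hypothetical, and noise terms before the start of the sample are taken as zero.

```python
def ma_series(mu, theta, eps):
    """Build X_t = mu + eps_t + sum_j theta_j * eps_{t-j}.

    Noise terms before the start of the sample are treated as zero.
    """
    q = len(theta)
    series = []
    for t in range(len(eps)):
        value = mu + eps[t]
        for j in range(1, q + 1):
            if t - j >= 0:
                value += theta[j - 1] * eps[t - j]
        series.append(value)
    return series


# Hypothetical MA(1) with made-up noise values
noise = [1.0, -2.0, 0.5, 0.0]
x = ma_series(mu=50.0, theta=[0.5], eps=noise)
print(x)
```

Each value depends only on the current noise term and the previous q noise terms, which is why an MA(q) series has no correlation beyond lag q.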

3.1.3 Auto-Regressive Moving Average (ARMA) model
The Auto-Regressive and Moving Average models combine to form the Auto-Regressive Moving Average (ARMA) time series model. The AR part regresses the variable on its own past values. The MA part models the error term as a linear combination of error terms occurring at various times in the past. The model is denoted ARMA(p,q) and can be represented by,
$$X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \tag{3.3}$$
Where p is the order of the Auto-Regressive part and q is the order of the Moving Average part.

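The model can also be specified in terms of the lag operator L, where $L^i X_t = X_{t-i}$ (the constant term is omitted here). The AR(p) model is then given by,
$$\varepsilon_t = \left(1 - \sum_{i=1}^{p} \varphi_i L^i\right) X_t = \varphi(L) X_t \tag{3.4}$$
Where $\varphi$ represents the polynomial
$$\varphi(L) = 1 - \sum_{i=1}^{p} \varphi_i L^i \tag{3.5}$$
The MA(q) model is given by,
$$X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t = \theta(L) \varepsilon_t \tag{3.6}$$
Where $\theta$ represents the polynomial
$$\theta(L) = 1 + \sum_{i=1}^{q} \theta_i L^i \tag{3.7}$$
The combined model ARMA(p,q) is given by
$$\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t \tag{3.8}$$
From equations 3.5 and 3.7,
$$\varphi(L) X_t = \theta(L) \varepsilon_t \tag{3.9}$$
Or
$$\frac{\varphi(L)}{\theta(L)} X_t = \varepsilon_t \tag{3.10}$$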
The main advantage of the ARMA model is that it provides a simple, compact representation of the system.
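
The ARMA relation can be turned around to recover the error terms from an observed series, which is the step that makes MA and ARMA fitting harder than AR fitting. The sketch below is illustrative Python, not the project's R code; `arma_residuals` is a hypothetical helper, the parameter values are made up, and all values before the start of the sample are treated as zero, which is a simplifying assumption.

```python
def arma_residuals(x, c, phi, theta):
    """Recover eps_t = X_t - c - sum_i phi_i X_{t-i} - sum_j theta_j eps_{t-j}.

    Series values and error terms before the start of the sample are
    treated as zero (a simplifying assumption).
    """
    eps = []
    for t in range(len(x)):
        predicted = c
        for i, coef in enumerate(phi, start=1):
            if t - i >= 0:
                predicted += coef * x[t - i]
        for j, coef in enumerate(theta, start=1):
            if t - j >= 0:
                predicted += coef * eps[t - j]
        eps.append(x[t] - predicted)
    return eps


# Hypothetical ARMA(1,1) parameters and a made-up series
residuals = arma_residuals([1.0, 2.0, 3.0], c=0.0, phi=[0.5], theta=[0.5])
print(residuals)
```

Each residual depends on the residuals before it, so the errors have to be computed recursively rather than read directly from the data.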

3.1.4 Auto-Regressive Integrated Moving Average (ARIMA) model
In time series analysis, the Auto-Regressive Integrated Moving Average (ARIMA) model is a generalization of the Auto-Regressive Moving Average (ARMA) model. Both models are fitted to time series data either to forecast future values or to better understand the data. When the data is non-stationary, an initial differencing step can be applied one or more times until the data becomes stationary.

The Auto-Regressive (AR) part of ARIMA indicates that the evolving variable of interest is regressed on its own prior values. The Moving Average (MA) part indicates that the regression error is a linear combination of error terms occurring at various times in the past. The Integrated (I) part indicates that the data values have been replaced by the difference between each value and the previous value.

A non-seasonal Auto-Regressive Integrated Moving Average model is denoted ARIMA(p,d,q), where p, d, and q are non-negative integers: p is the order of the AR part, d is the degree of differencing (the number of times the previous value is subtracted from the present value), and q is the order of the MA part. For example, ARIMA(1,0,1) has p=1, d=0, and q=1. The ARIMA(p,d,q) model is given mathematically by,
$$\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right)(1 - L)^d X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t \tag{3.11}$$
Where L is the lag operator, $\varphi_i$ are the parameters of the AR part, $\theta_i$ are the parameters of the MA part and $\varepsilon_t$ is the white noise error term.
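
As a worked illustration of equation 3.11, consider the case p = 1, d = 1, q = 1, where $\varphi_1$ and $\theta_1$ denote the single AR and MA parameters and $\varepsilon_t$ the white noise term. Expanding the lag polynomials gives an explicit recursion for $X_t$:

```latex
\begin{align*}
(1 - \varphi_1 L)(1 - L)\, X_t &= (1 + \theta_1 L)\, \varepsilon_t \\
\bigl(1 - (1 + \varphi_1) L + \varphi_1 L^2\bigr) X_t &= \varepsilon_t + \theta_1 \varepsilon_{t-1} \\
X_t &= (1 + \varphi_1) X_{t-1} - \varphi_1 X_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1}
\end{align*}
```

Equivalently, the differenced series $X_t - X_{t-1}$ follows an ARMA(1,1) model with the same parameters.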

Differencing is a transformation that makes a non-stationary time series stationary. The difference between the present and the previous value is computed as
$$y'_t = y_t - y_{t-1} \tag{3.12}$$
Differencing removes changes in the level of the time series. Sometimes the data has to be differenced a second time to obtain a stationary series; this is called second-order differencing.

$$y^{*}_t = y'_t - y'_{t-1} \tag{3.13}$$
$$y^{*}_t = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) \tag{3.14}$$
$$y^{*}_t = y_t - 2y_{t-1} + y_{t-2} \tag{3.15}$$
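
First- and second-order differencing can be checked numerically with a short sketch. This is illustrative Python rather than the project's R code, and the helper name `difference` and the sample values are made up.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: y'_t = y_t - y_{t-1}."""
    result = list(series)
    for _ in range(order):
        result = [result[t] - result[t - 1] for t in range(1, len(result))]
    return result


# Made-up values with a steadily growing trend
load = [10, 12, 15, 19, 24]
first = difference(load)            # first-order differences
second = difference(load, order=2)  # second-order differences

# Direct evaluation of y_t - 2*y_{t-1} + y_{t-2} for comparison
direct = [load[t] - 2 * load[t - 1] + load[t - 2] for t in range(2, len(load))]
print(first, second, direct)
```

Differencing twice gives the same values as the closed form for the second difference, and each pass shortens the series by one sample.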
The ARIMA model can be extended by adding an independent (exogenous) variable. Because the electrical load varies with time, the ARIMA model is well suited to forecasting it with as small an error as possible.
3.2 LOAD FORECAST USING ARIMA MODEL IN R
3.2.1 Dataset considered
The data considered in this project is the actual load consumption recorded by the meter of the station at one-minute intervals. Two datasets are used: one day, consisting of 1440 samples, and one week, consisting of 10080 samples. Both are processed in R for the forecast. A second, meteorological dataset containing the temperature is considered for the same day and the same week. The datasets are used to compare, in terms of forecast error, how the weather factor influences load forecasting.

3.2.1 Algorithm for time series load forecasting using the ARIMA model in R
Time series forecasting of the load data using the ARIMA model in R involves the following steps.

Step-1: Start.

Step-2: All the forecasting packages required to develop the ARIMA model are loaded into R.

Step-3: The datasets are imported into R one at a time.

Step-4: Each dataset is stored in a newly created variable.

Step-5: The one-day load data is divided into training data (the first 1200 samples) and test data (the remaining 240 samples). The one-week load data is divided into training data (the first 9080 samples) and test data (the remaining 1000 samples).

Step-6: Similarly, the temperature data is divided into training and test data for both one day and one week.

Step-7: The function auto.arima is used to fit the training data. It selects suitable ARIMA parameters p, d, and q based on the stationarity of the data.

Step-8: The auto.arima function can take an independent variable as an exogenous regressor, which here is the temperature data. To compare the accuracy of the forecast, two cases are considered: without the temperature data and with the temperature data.

Step-9: The forecast is carried out for each case and each dataset.

Step-10: The Root Mean Square Error (RMSE) between the forecasted and actual values is computed as the measure of forecast accuracy.

Step-11: The forecasted values are plotted against time for both one day and one week, with and without the independent variable.

Step-12: Stop.
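
The data handling in Steps 5 and 10 can be sketched as follows. This is an illustrative Python sketch, not the R code used in the project: the load series is synthetic, the helper names are hypothetical, and the forecaster is a simple stand-in that repeats the last training value. The ARIMA fit itself (Steps 7 to 9, done with auto.arima in R) is not reproduced here.

```python
import math


def train_test_split(series, n_train):
    """Split a series into the first n_train samples and the remainder."""
    return series[:n_train], series[n_train:]


def rmse(actual, forecast):
    """Root Mean Square Error between two equally long sequences."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))


# Synthetic one-day series of 1440 one-minute samples (not the station data)
load = [100.0 + 20.0 * math.sin(2 * math.pi * t / 1440) for t in range(1440)]
train, test = train_test_split(load, 1200)

# Stand-in forecaster: repeat the last training value (NOT an ARIMA fit)
forecast = [train[-1]] * len(test)
error = rmse(test, forecast)
print(len(train), len(test), error)
```

Keeping the test samples out of the fit is what makes the RMSE a measure of forecast accuracy rather than of how well the model reproduces data it has already seen.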

3.2.2 Flowchart of the proposed algorithm
[Flowchart: Start -> Load all the forecasting packages required to develop the ARIMA model in R -> Import the load data samples and store them in new variables -> either (a) fit the load data with the auto.arima function, or (b) load the temperature data into the xreg variable of auto.arima and fit the load data and temperature data -> Forecast the load horizon both with and without consideration of temperature data -> Obtain the RMSE between the forecasted and actual values to find the accuracy of the forecast -> Plot graphs of load consumption for the considered sample period and of the forecasted values -> Stop]

Figure 3.1: Flowchart of the proposed algorithm.

CHAPTER 4
RESULTS AND DISCUSSION
This chapter presents a detailed comparison of the forecast error for short-term forecasts obtained with the time series ARIMA model, which helps in improving the forecast for further use. The load forecast is computed using the forecast package in R.

4.1 Results
This section presents the results obtained for both one day and one week: the fitted ARIMA models and the forecasts of the future load.

4.1.1 Data samples considered for One-day
One-day data samples at one-minute intervals are considered for the forecast using the ARIMA model in R. The first 1200 samples are taken as training data and the next 240 samples as test data against which the forecast is compared. Figure 4.1 shows the variation in load consumption throughout the day. The load curve shows that the load consumption between 10 am and 6 pm is lower than during the rest of the day.

Figure 4.1: Load curve of one day.

Figure 4.2: Forecast value without weather data.

Figure 4.2 shows the forecast obtained without the weather parameter, using the ARIMA model whose orders were selected by the auto.arima function: the AR order p is 5, the differencing order d is 1, and the MA order q is 1. The function selects these orders automatically for the given data, based on its degree of stationarity. The fitted model is,
ARIMA(5,1,1)
Coefficients:
         ar1      ar2     ar3     ar4     ar5      ma1
      0.3224  -0.0005  0.0233  0.1210  0.0731  -0.3527
s.e.  0.2017   0.0301  0.0301  0.0304  0.0451   0.2016
sigma^2 estimated as 36.33: log likelihood=-3852.18
AIC=7718.36 AICc=7718.46 BIC=7753.99
Here the log likelihood is the logarithm of the probability of the observed data under the estimated model. Akaike's Information Criterion (AIC) is used to select the order of the ARIMA model; AICc is the AIC corrected for small sample sizes, and BIC is the Bayesian Information Criterion. A good model is obtained by minimizing the AIC, AICc, or BIC.
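
The relation between the log likelihood and the information criteria can be checked against the ARIMA(5,1,1) output quoted above. The sketch below is illustrative Python, not part of the project's R workflow. Two values are assumptions: the parameter count k = 7 (six coefficients plus the noise variance) and the effective sample size n = 1199 (the 1200 training samples minus one lost to differencing).

```python
import math


def aic(log_likelihood, k):
    """Akaike's Information Criterion: 2k - 2*log(L)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k*log(n) - 2*log(L)."""
    return k * math.log(n) - 2 * log_likelihood


log_likelihood = -3852.18  # from the ARIMA(5,1,1) output
k = 7                      # assumed: 6 coefficients + 1 for sigma^2
n = 1199                   # assumed: 1200 training samples, one lost to differencing

aic_value = aic(log_likelihood, k)
bic_value = bic(log_likelihood, k, n)
print(aic_value, bic_value)
```

Under these assumptions the computed values agree with the reported AIC of 7718.36 and BIC of 7753.99 to within rounding. Both criteria penalize extra parameters, and BIC penalizes them more heavily as the number of samples grows.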

Figure 4.3: Temperature-time characteristics of one day considered for load forecasting.

Figure 4.3 is the plot of ambient temperature versus time considered for the one-day load forecast.

Figure 4.4: Forecast value with weather data.

Figure 4.4 is the plot of the forecast when the ambient temperature data is included in the fit using the auto.arima function. With the temperature data, the ARIMA parameters p, d, and q are found to be 5, 1, and 3 respectively. The fitted model with temperature data is,
ARIMA(5,1,3)
Coefficients:
         ar1      ar2     ar3     ar4     ar5      ma1     ma2      ma3     xreg
      0.3891  -0.4502  0.6841  0.0801  0.1052  -0.4203  0.4515  -0.6845  -0.2160
s.e.  0.1279   0.1121  0.0965  0.0335  0.0393   0.1263  0.1120   0.0966   0.8422
sigma^2 estimated as 36.19: log likelihood=-3848.32
AIC=7716.65 AICc=7716.83 BIC=7767.54
Here xreg is the argument of the ARIMA function that takes the independent variable; the temperature values are passed to it.

4.1.2 Data samples considered for One-week
One-week data samples at one-minute intervals are considered for the forecast using the ARIMA model in R. The first 9080 samples are taken as training data and the next 1000 samples as test data against which the forecast is compared. Figure 4.5 is the load curve for the week; the data runs from Tuesday to the following Monday. The load curve shows the variation of load consumption throughout the week. Samples 5000 to 7000 correspond to the weekend, when the load consumption is lower than on weekdays. The weekly load pattern also influences the load forecast.

Figure 4.5: Load curve of a week.

Figure 4.6: Forecast value without weather data.

Figure 4.6 is the plot of the forecast obtained from the one-week data samples without the temperature data. The auto.arima function selects the ARIMA parameters p, d, and q as 1, 1, and 1 respectively and fits the data for the forecast: the Auto-Regressive (AR) order is 1, the differencing order is 1, and the Moving Average (MA) order is 1. These values are obtained based on the stationarity of the data.
ARIMA(1,1,1)
Coefficients:
         ar1      ma1
      0.8451  -0.8584
s.e.  0.1185   0.1150
sigma^2 estimated as 21.97: log likelihood=-26908.15
AIC=53822.3 AICc=53822.3 BIC=53843.64

Figure 4.7: Temperature-time characteristics of one week considered for load forecasting.
Figure 4.7 is the plot of ambient temperature versus time considered for the one-week load forecast.

Figure 4.8: Forecast value with weather data.

Figure 4.8 is the plot of the forecast when the temperature data is considered. The auto.arima function again determines the parameters of the ARIMA model; with temperature included they are found to be p=1, d=1, and q=1. The fitted model is,
ARIMA(1,1,1)
Coefficients:
         ar1      ma1    xreg
      0.8435  -0.8571  0.8923
s.e.  0.1192   0.1155  0.2190
sigma^2 estimated as 21.94: log likelihood=-26900.15
AIC=53808.3 AICc=53808.31 BIC=53836.76
As noted at the start of this section, a good model is obtained by minimizing the AIC, AICc, and BIC. These criteria depend on the likelihood of the data, which in turn depends on the number of data samples, so they should be compared only between models fitted to the same dataset and not between the one-day and one-week fits. For the one-day data, including temperature lowers the AIC from 7718.36 to 7716.65, while the BIC rises from 7753.99 to 7767.54. For the one-week data, it lowers the AIC from 53822.3 to 53808.3 and the BIC from 53843.64 to 53836.76. To obtain a more accurate forecast, a sufficient number of data samples must be considered.

4.2 Discussions
Short-Term Load Forecasting covers horizons ranging from minutes to weeks and is used for ensuring the stability of the system, scheduling, control of the power system, security analysis, and as an input to contingency analysis. The forecast therefore has to be accurate, since it supports important functions of the electrical power system. Comparing the RMSE values shows how the error decreases when the independent variable is considered. The ARIMA forecasts also show that the error decreases as the number of samples increases. Table 4.1 gives the RMSE obtained for each case.

Case                                      RMSE
1. One day without temperature data       6.009955
2. One day with temperature data          5.990357
3. One week without temperature data      4.68697
4. One week with temperature data         4.682843
Table 4.1: Root Mean Square Error for the forecasted load in different conditions.

The RMSE for one day without the independent variable, i.e., temperature, is 6.009955, whereas with temperature data it is 5.990357. This shows that considering the independent variable reduces the forecast error, although the reduction is small (about 0.3%). For the one-week data the RMSE without temperature is 4.68697, which is lower than the one-day RMSE of 6.009955, indicating that the number of samples considered affects the load forecast: the more samples, the higher the accuracy achieved. When the temperature data is considered along with the one-week load data, the RMSE decreases slightly, to 4.682843. The RMSE should be as low as possible to achieve good accuracy. The more accurate the forecast, the more smoothly the power system can be operated. If the forecast accuracy is poor, maintaining the stability of the system becomes difficult, because the load is increasing day by day and generation has to meet the load demand. An accurate forecast helps in scheduling and controlling the power system.
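
The size of each improvement in Table 4.1 can be expressed as a percentage reduction in RMSE. The sketch below is illustrative Python; the helper name `percent_reduction` is hypothetical, and the RMSE values are copied from Table 4.1.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# RMSE values from Table 4.1
day_without, day_with = 6.009955, 5.990357
week_without, week_with = 4.68697, 4.682843

day_gain = percent_reduction(day_without, day_with)      # effect of temperature, one day
week_gain = percent_reduction(week_without, week_with)   # effect of temperature, one week
data_gain = percent_reduction(day_without, week_without) # effect of more samples

print(round(day_gain, 2), round(week_gain, 2), round(data_gain, 2))
```

On these figures, moving from one day to one week of data reduces the RMSE far more than adding temperature does in either case.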
CHAPTER 5
CONCLUSION AND FUTURE SCOPE
This project carries out time series forecasting of electrical load data over short time intervals. The different cases studied show how the independent variable improves the accuracy of the forecast.

The time series forecasting is carried out by fitting an ARIMA model to the available data and predicting the future load. Table 4.1 shows that the forecast error is reduced when the weather, i.e., the ambient temperature, is considered; it is one of the main factors affecting the load forecast. The forecast also depends on the load samples considered. When a more regular load pattern is used, still higher accuracy can be achieved.

Electrical load forecasting plays an important role in power system activities because it predicts future values with reasonable accuracy. Using those values as a reference, the power system can be operated properly. Utilities can ask consumers in advance to reduce their load consumption so that generation and load consumption stay balanced. Because the data plays a major role, it is important to extract it with a proper technique and use it as input to the forecast. Advances in smart meters help with proper storage of the data and make it easy to communicate and access the data for further use. Similarly, for independent variables such as temperature, many advanced meters have been developed that provide accurate values of temperature and other parameters.

Much more data is available today than in earlier times, and future work should aim to extract, process, and analyze it for electrical load forecasting. Many other independent variables also affect load forecasting and have to be considered. An online, real-time algorithm that considers all types of independent variables, and thereby increases the accuracy of the forecast, would greatly benefit the electrical system.
