  • ThesisItemOpen Access
    Variable selection for classification and discrimination of Indian Mustard (Brassica juncea) genotypes for yield and oil content
    (CCSHAU, Hisar, 2019-07-10) Godara, Poonam; Hooda, BK
    The present study deals with variable selection for classification and discrimination of Indian mustard (Brassica juncea) genotypes for yield and oil content. The study used secondary data on 310 Indian mustard genotypes obtained from the Oilseeds Section of the Department of Genetics and Plant Breeding, CCS HAU, Hisar. The experiment was conducted during the rabi season of 2015-16. Five variable selection methods for classification and discrimination (univariate two-sample t-test, Rao's F test for additional information, the STEPDISC procedure (backward and forward) using the Wilk's lambda criterion, and the random forests algorithm) were compared using Monte Carlo simulation. Performance of the methods was assessed in terms of leave-one-out cross-validation error for classification. In scheme I, which compared the methods for seed yield using samples of equal size, Rao's F test, Wilk's lambda (backward) and Wilk's lambda (forward) performed better than the others. In scheme II, for oil content, the methods with the lowest leave-one-out cross-validation error rate were Wilk's lambda (backward) and Wilk's lambda (forward). Based on the results of schemes I and II, Wilk's lambda (backward and forward) was found to be the most suitable method for classification for both seed yield and oil content. In scheme I, the four important variables for discrimination for seed yield per plant were secondary branches, primary branches, days to maturity and siliqua number on main shoot, with the lowest leave-one-out cross-validation error rate of 21.72 per cent. The important variables for discrimination for oil content were siliqua length, secondary branches, primary branches and days to maturity, with the lowest error rate of 33.90 per cent.
Secondary branches, siliqua number on main shoot, seeds per siliqua and 1000-seed weight were found to be important variables in scheme III, with the lowest error rate of 27.68 per cent. The three characters that discriminated between the low and high seed yield groups were 1000-seed weight, siliqua length and seeds per siliqua, while siliqua length, 1000-seed weight and primary branches were the most discriminating variables for oil content. Using the correlation between variables and discriminant score, the most important variables for seed yield were secondary branches, primary branches and days to maturity. The three most important variables discriminating between oil content groups were siliqua length, secondary branches and seeds per siliqua. The most important variables discriminating between the low seed yield with low oil content group and the high seed yield with high oil content group were secondary branches, primary branches and siliqua number on main shoot. The number of secondary branches was found to be the most important variable for classification and discrimination of Indian mustard genotypes for seed yield and oil content.
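The leave-one-out cross-validation error used above to compare variable subsets can be illustrated with a minimal sketch. It assumes a simple nearest-centroid classifier rather than the discriminant procedures of the thesis, and the toy data, the feature indices and the function names `nearest_centroid_predict` and `loocv_error` are hypothetical.

```python
from math import dist

def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean vector is closest (Euclidean distance)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda label: dist(centroids[label], x))

def loocv_error(X, y, features):
    """Leave-one-out misclassification rate using only the given feature indices."""
    Xs = [[row[j] for j in features] for row in X]
    errors = 0
    for i in range(len(Xs)):
        # Hold out observation i, fit on the rest, then classify the held-out row.
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, Xs[i]) != y[i]:
            errors += 1
    return errors / len(Xs)

# Hypothetical toy data: feature 0 separates the groups, feature 1 is mostly noise.
X = [[1.0, 5.0], [1.2, 1.0], [0.9, 3.0], [3.0, 4.0], [3.2, 2.0], [2.9, 5.5]]
y = ["low", "low", "low", "high", "high", "high"]
print("error with feature 0:", loocv_error(X, y, [0]))
print("error with feature 1:", loocv_error(X, y, [1]))
```

A variable subset with a lower leave-one-out error would be preferred under this criterion.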
  • ThesisItemOpen Access
    Multidimensional analysis of poverty in Haryana: A fuzzy set approach
    (CCSHAU, 2018) Tanwar, Nitin; Hooda, B.K.
    The present investigation was carried out to measure aspect-based multidimensional poverty in Haryana. The data for the study were obtained from the NSSO consumer expenditure survey (68th round conducted in 2011-12 and 69th round conducted in 2012) on drinking water, sanitation, hygiene and housing conditions. The Multidimensional Poverty Index (MPI) suggested by Alkire & Foster (2011), which uses the dual cut-off method based on the counting approach, has been applied for poverty estimation in rural and urban areas of Haryana. The Totally Fuzzy and Relative Approach due to Costa and Angelis (2008) has also been used to measure multidimensional poverty in Haryana. Univariate techniques for poverty measurement based on monetary data, such as the Head Count Ratio (HCR), Income Gap Ratio (IGR) and Poverty Gap Ratio (PGR), have also been used to estimate the proportion of deprived households at district level in Haryana. The HCR indicated that the districts of Mewat and Fatehabad have the maximum proportion of poor households in rural Haryana, while the districts of Mewat and Yamuna Nagar have the maximum proportion of poor households in urban Haryana. The districts of Jhajjar, Gurgaon, Sonipat and Karnal have the minimum proportion of poor households in rural Haryana, while the districts of Hisar, Fatehabad and Gurgaon have the minimum proportion of poor households in urban Haryana. The maximum PGR has been observed in the districts of Fatehabad, Yamuna Nagar and Mewat in rural Haryana, while urban households in the districts of Mewat and Yamuna Nagar have the maximum poverty gap ratio. The fuzzy MPI based on drinking water facilities, sanitation facilities and housing conditions indicated that 33.28% of households in Haryana overall are multidimensionally poor, with 36.64% of households in rural and 30.46% in urban Haryana.
The decomposition of households by social group indicated that there is not much difference in multidimensional poverty index values among households belonging to scheduled castes (SC), other backward classes (OBC) and others. The index values varied from 30.49 to 34.24 per cent among the social groups. Using the Alkire-Foster aspect-based MPI, it was observed that rural households in the districts of Mewat, Panipat, Mahendragarh, Rohtak, Gurgaon and Palwal have high MPI values, indicating a high level of poverty or deprivation in these districts. Similarly, households in urban areas of the districts of Mewat, Panipat, Jhajjar, Rohtak and Mahendragarh were found to be multidimensionally poor, as indicated by high MPI values.
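The dual cut-off counting approach mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the first cut-off is taken as already applied (each household is coded 0/1 per indicator), the indicators, equal weights, poverty cut-off `k` and the four households are hypothetical, and the function name `alkire_foster` is chosen here for illustration.

```python
def alkire_foster(deprivations, weights, k):
    """Return (H, A, M0) for a 0/1 deprivation matrix.

    deprivations: one row per household, 1 = deprived in that indicator
                  (the first, indicator-level cut-off is assumed already applied).
    weights:      indicator weights summing to 1.
    k:            second cut-off; a household is poor if its weighted
                  deprivation score is at least k.
    """
    scores = [sum(w * d for w, d in zip(weights, row)) for row in deprivations]
    poor_scores = [s for s in scores if s >= k]
    H = len(poor_scores) / len(scores)  # headcount ratio of the multidimensionally poor
    A = sum(poor_scores) / len(poor_scores) if poor_scores else 0.0  # average intensity
    return H, A, H * A  # M0 is the adjusted headcount ratio

# Hypothetical households scored on drinking water, sanitation and housing.
households = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
H, A, M0 = alkire_foster(households, weights=[1 / 3, 1 / 3, 1 / 3], k=0.5)
print(f"H={H:.3f}  A={A:.3f}  M0={M0:.3f}")
```

Here M0 is the product of the share of poor households and the average share of weighted deprivations they experience.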
  • ThesisItemOpen Access
    Predictive modelling with constant and varying coefficients over time for wheat yield in Haryana
    (CCSHAU, 2017) Salinder; Verma, Urmil
    Statistics plays an important role in all fields of life, and the applications of statistical techniques are numerous. Regression models using time series data occur quite often; however, the assumption of uncorrelated or independent errors is often not appropriate for such data. It is common to find response variables that do not fit the standard assumptions of the linear model. Generalized linear models extend the well-known linear model to accommodate non-normal response variables. One such extension is the class of varying coefficient models. In these models, the response variable is allowed to depend linearly on some regressors, with coefficients that are smooth functions of other predictor variables, called effect modifiers. A special case arises for time series data, where the effect modifier is usually calendar time, resulting in time-varying coefficient models. The statistical modelling approaches, viz. multiple linear regression, linear discriminant function and linear mixed effects, were applied to obtain district-level wheat yield estimates on an agro-climatic zone basis in Haryana. The DOA wheat yield data for the period 1980-81 to 2014-15 for Ambala, Kurukshetra, Rohtak, Karnal, Jind, Sonipat, Gurgaon, Faridabad, Mahendragarh, Hisar, Sirsa and Bhiwani districts, 1989-90 to 2014-15 for Yamunanagar, Panipat, Kaithal and Rewari, 1995-96 to 2014-15 for Panchkula, 1997-98 to 2014-15 for Jhajjar and Fatehabad, and 2006-07 to 2014-15 for Mewat district were used to obtain trend yield. The fortnightly weather data of Hisar, Ambala, Karnal, Rohtak, Gurgaon and Bawal were used for the purpose. The zonal wheat yield forecast models were developed on the basis of time-trend and weather data from 1980-81 to 2009-10, while the data from 2010-11 to 2014-15 were used for validation of the developed models.
Yield/time variables were included to account for variation between districts within a zone, as weather data were not available for all the districts. The zonal models used the same weather information for adjoining districts within a zone, so that a longer series could be obtained in a relatively shorter period; this also provided the basis for using advanced statistical techniques. For quantitative forecasting, zonal wheat yield models were fitted with fortnightly weather data (regression analysis) and discriminant/weather scores (discriminant analysis), along with trend yield, as regressors and DOA wheat yield as regressand. Alternatively, linear mixed effects models with random time effects at district and zone level and random time/weather effects at intercept, with different covariance structures, were tried. The fitted models were compared on the basis of statistics such as AIC, BIC and log likelihood. The predictive performance of the contending models was assessed in terms of percent deviations of the wheat yield forecasts from the observed yields, as well as root mean square errors. The linear mixed effects models, i.e. varying coefficients models, performed well, with lower error metrics than the alternative models in most of the time regimes. The five-step-ahead forecasts, i.e. for 2010-11 to 2014-15, favour the use of varying coefficient models for pre-harvest wheat yield prediction in Haryana. The overall results indicate a preference for varying coefficients models over conventional, i.e. constant coefficients, models for this empirical study. In addition, the developed models are capable of providing reliable yield estimates well in advance of the crop harvest, whereas the DOA yield estimates are obtained quite late after the actual harvest of the crop.
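The two forecast-accuracy measures named above, percent deviation and root mean square error, can be written out as a short sketch. The sign convention (forecast minus observed, relative to observed) is an assumption, and the yield figures are hypothetical, not taken from the thesis.

```python
from math import sqrt

def percent_deviation(observed, forecast):
    """Percent deviation of each forecast from the observed value."""
    return [100.0 * (f - o) / o for o, f in zip(observed, forecast)]

def rmse(observed, forecast):
    """Root mean square error over the validation years."""
    squared = [(f - o) ** 2 for o, f in zip(observed, forecast)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical observed and forecast yields for five validation years.
observed = [45.0, 46.5, 44.0, 47.2, 46.0]
forecast = [44.1, 47.0, 45.3, 46.8, 45.2]
print([round(d, 2) for d in percent_deviation(observed, forecast)])
print(round(rmse(observed, forecast), 3))
```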
  • ThesisItemOpen Access
    ARIMA, state space and mixed modeling for sugarcane yield prediction in Haryana
    (CCSHAU, 2017) Suman; Verma, Urmil
    Forecasting of crop production is one of the most important aspects of the agricultural statistics system. Crop production forecasting comprises crop identification, area estimation and prediction of crop yield. Understanding the behaviour of crop yields is increasingly important for modeling production functions, forecasting price movements and understanding farmers’ responses to government programs. The statistical modeling approaches, viz. ARIMA, state space and linear mixed modeling, were used to obtain district-level sugarcane yield estimates in the major sugarcane growing districts of Haryana. The time-series sugarcane yield data for the period 1960-61 to 2009-10 for Karnal and Ambala districts, 1972-73 to 2009-10 for Kurukshetra district and 1980-81 to 2009-10 for Panipat and Yamunanagar districts were used for the development of the different models. The selected models were validated using data for the subsequent years, i.e. 2010-11 to 2014-15, which were not included in the development of the yield forecast models. After experimenting with different lags of the moving average and autoregressive processes, ARIMA(0,1,1) was fitted for Karnal and Ambala districts and ARIMA(1,1,0) for Kurukshetra, Panipat and Yamunanagar districts. The underlying parameters of ARIMA models are assumed to be constant; however, agricultural data are generally collected over time and may therefore have time-dependent parameters. State space procedures, which give time-varying parameter models, allow for known changes in the structure of the system over time. Thus, the same time series data were analyzed to obtain sugarcane yield estimates for the same five post-sample years using state space procedures with the Kalman filtering technique. Lastly, linear mixed models with time as both fixed and random effects, using different covariance structures, viz. VC, AR(1) and Toeplitz, were developed for sugarcane yield prediction in the targeted districts.
Finally, the fitted models were compared on the basis of statistics such as AIC, BIC and log likelihood. The sugarcane yield estimates for the post-sample years 2010-11 to 2014-15 were then obtained from the fitted ARIMA, state space and linear mixed models. The predictive performance of the contending models was assessed in terms of percent deviations of the sugarcane yield forecasts from the observed yields, as well as root mean square errors. The state space models performed well, with lower error metrics than the alternative models in all time regimes, i.e. these models were consistently superior to the ARIMA and linear mixed models in terms of percent relative deviations. In addition, the developed models are capable of providing reliable yield estimates well in advance of the crop harvest, whereas the DOA yield estimates are obtained quite late after the actual harvest of the crop.
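The Kalman filtering step mentioned above can be illustrated with the simplest state space model, the local level model (whose reduced form is an ARIMA(0,1,1) process). This is a sketch, not the specification used in the thesis: the noise variances `q` and `r` are fixed by assumption rather than estimated, the initial state is diffuse-like, and the yield series is hypothetical.

```python
def local_level_filter(y, q, r, m0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t.

    q: variance of the level disturbance eta_t (assumed known here).
    r: variance of the observation noise eps_t (assumed known here).
    Returns the one-step-ahead forecasts and the filtered level estimates.
    """
    m, p = m0, p0
    forecasts, filtered = [], []
    for obs in y:
        # Prediction step: the level is a random walk, so only the variance grows.
        m_pred, p_pred = m, p + q
        forecasts.append(m_pred)
        # Update step: move the prediction towards the observation by the gain.
        gain = p_pred / (p_pred + r)
        m = m_pred + gain * (obs - m_pred)
        p = (1.0 - gain) * p_pred
        filtered.append(m)
    return forecasts, filtered

# Hypothetical yield series; the first forecast only reflects the vague prior.
yields = [60.2, 61.0, 59.5, 62.3, 63.1, 62.0, 64.4]
forecasts, filtered = local_level_filter(yields, q=1.0, r=2.0)
print([round(v, 2) for v in filtered])
```

Because the gain is recomputed at every step, the weight given to new observations changes over time, which is the sense in which the fitted relationship is time-varying.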
  • ThesisItemOpen Access
    Probability Models for Spatial and Temporal Distributions of Rainfall in Haryana
    (CCSHAU, 2015) Bhushana Babu, V.; Hooda, B.K.
    The present investigation was carried out to study probability models for spatial and temporal distributions of rainfall in Haryana. Various probability distributions describing daily, weekly and monthly rainfall behavior in Haryana were applied. Daily rainfall data of 34 years (1971 to 2005) covering 42 rainfall stations across Haryana were used. Exponential, gamma, Gumbel, lognormal and Weibull distributions were fitted to daily, weekly and monthly rainfall. The Maximum Likelihood (ML) method was used for estimating the parameters of the probability distributions. Kolmogorov-Smirnov (KS), Anderson-Darling (AD) and Chi-Square tests were used to test the goodness of fit of the fitted distributions. No single distribution was found to describe the rainfall pattern at all the stations. However, at most of the stations the lognormal distribution was the best fit for daily rainfall based on the KS and AD tests, while the Gumbel distribution was the best fit for weekly and monthly rainfall based on the KS and Chi-Square tests. A Multi-Criteria Decision Approach (MCDA) based on the fuzzy majority approach was used to select the best statistical distribution among the exponential, gamma, lognormal and Weibull distributions for describing daily, weekly and monthly rainfall. The lognormal distribution was found to be the best-fitting distribution for daily rainfall, while the gamma distribution was found to be the best-fitting distribution for weekly and monthly rainfall in various districts of Haryana.
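The fit-then-test procedure described above can be sketched for two of the candidate distributions that have closed-form maximum likelihood estimates. The rainfall amounts are hypothetical, only strictly positive (wet-day) amounts are assumed, and the function names are illustrative. Note that standard KS critical values are not strictly valid when the parameters are estimated from the same sample, so the statistic is used here only to rank the fits.

```python
from math import erf, exp, log, sqrt

def fit_exponential(x):
    """ML estimate of the exponential rate: 1 / sample mean."""
    return len(x) / sum(x)

def fit_lognormal(x):
    """ML estimates (mu, sigma) of the lognormal: mean and SD of log(x)."""
    logs = [log(v) for v in x]
    mu = sum(logs) / len(logs)
    sigma = sqrt(sum((l - mu) ** 2 for l in logs) / len(logs))
    return mu, sigma

def ks_statistic(x, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(x)
    n = len(xs)
    return max(max((i + 1) / n - cdf(v), cdf(v) - i / n) for i, v in enumerate(xs))

# Hypothetical wet-day rainfall amounts in mm.
rain = [0.8, 1.5, 2.2, 3.0, 4.1, 5.6, 7.9, 10.4, 15.2, 28.7]

rate = fit_exponential(rain)
mu, sigma = fit_lognormal(rain)

d_exp = ks_statistic(rain, lambda v: 1.0 - exp(-rate * v))
d_logn = ks_statistic(
    rain, lambda v: 0.5 * (1.0 + erf((log(v) - mu) / (sigma * sqrt(2.0))))
)
print(f"KS exponential: {d_exp:.3f}   KS lognormal: {d_logn:.3f}")
```

The distribution with the smaller KS statistic fits the sample more closely under this criterion.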
  • ThesisItemOpen Access
    Prediction of milk production using penalized regression techniques in cattle
    (CCSHAU, 2013) Hemant Kumar; Hooda, B.K.
    Multiple linear regression (MLR) models have been widely used in dairy sciences to predict lifetime milk production in cattle on the basis of lactation traits. MLR often gives unsatisfactory results in the presence of high multicollinearity among the explanatory variables. The choice of functional form and the selection of explanatory variables are also important for obtaining a parsimonious and useful model for explaining any phenomenon. In the presence of multicollinearity and model mis-specification, ordinary least squares estimators of the regression parameters generally have low bias but large variances, resulting in poor predictive performance. In view of the multicollinearity, shrinkage and penalized regression techniques have been used along with artificial neural networks for prediction of lifetime milk production on the basis of lactation traits. In the present study, lactation traits such as previous lactation yield, age at first calving, lactation length, calving interval, service period and dry period have been used for prediction of lifetime milk yield in crossbred cattle data. Lifetime milk production has been defined as the total amount of milk produced by cattle from the initiation of the first lactation until the completion of the third lactation. Small eigenvalues of the correlation matrix of the predictor variables, high values of the variance inflation factor and a high condition index indicated the presence of multicollinearity in the crossbred cattle data. Consequently, biased and penalized regression models have been adopted to deal with multicollinearity among the predictors. In addition to ridge regression, the relatively recent penalized regression techniques called LASSO and elastic net, given by Tibshirani (1996) and Zou and Hastie (2005) respectively, were also applied for developing a prediction model for lifetime milk production and for selecting the principal lactation traits.
On the basis of AIC and BIC values, LASSO and elastic net outperformed ridge regression, and the elastic net technique was found most satisfactory. Forward selection, backward elimination, LASSO and elastic net were used for selecting the best subset of lactation traits for prediction of lifetime milk production. It was observed that seven variables out of eleven were selected by LASSO and six by elastic net using the optimal values of the regularization parameters. The optimum values of the regularization parameters were computed using 10-fold cross-validation. The number of traits in the best subset was found to be six for backward elimination and four for the forward selection method. On the basis of adjusted R2, AIC and BIC values and the simplicity of the model, it was concluded that the subset selected by the LASSO technique, having just two significant traits, was best. The predictive performance of the multiple regression, ridge regression, LASSO, elastic net and ANN models was evaluated by dividing the sample under study into two sets, with 90% of the observations in the training set and 10% in the test set. The coefficient of determination, root mean square error, mean absolute error, mean absolute percentage error and Theil’s U-statistic were computed for the test set, and based on these performance measures elastic net was found to be the most satisfactory technique for prediction of lifetime milk yield using lactation traits in crossbred cattle.
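The way LASSO and elastic net shrink coefficients and drop variables can be sketched with cyclic coordinate descent. This is a minimal illustration under stated assumptions, not the software used in the thesis: the objective is taken as (1/2n)·RSS + lam·(alpha·|b|₁ + (1 − alpha)/2·|b|₂²), the predictors and response are assumed centered (no intercept), no predictor column is all zeros, and the data and the names `soft_threshold` and `elastic_net` are hypothetical.

```python
def soft_threshold(rho, t):
    """Shrink rho towards zero by t; values within [-t, t] become exactly zero."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0

def elastic_net(X, y, lam, alpha=1.0, n_iter=200):
    """Cyclic coordinate descent; alpha=1 gives LASSO, alpha=0 gives ridge."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X @ beta; beta starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes its own effect.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

# Hypothetical centered data: the first predictor matters, the second barely does.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
print("least squares:", elastic_net(X, y, lam=0.0))
print("LASSO:        ", elastic_net(X, y, lam=0.1, alpha=1.0))
print("elastic net:  ", elastic_net(X, y, lam=0.1, alpha=0.5))
```

In practice the penalty `lam` would be chosen by cross-validation, as the abstract describes, rather than fixed in advance.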
  • ThesisItemOpen Access
    Structural Equation Modeling With Latent Variables For Assessment Of Regional Development In Haryana
    (College of Basic Sciences and Humanities, Chaudhary Charan Singh Haryana Agricultural University, Hisar, 2010) Sheoran, Parkash Om; Rai, Lajpat