中国科学院研究生院学报
JOURNAL OF THE GRADUATE SCHOOL OF THE CHINESE ACADEMY OF SCIENCES
2009
Issue 2
pp. 173-184 (12 pages)
Keywords: zero-inflated Poisson regression model; partially linear model; sieve maximum likelihood estimation; strong consistency; asymptotic efficiency
Poisson regression models are widely used for count data. In practice, however, the proportion of zeros is often much larger than the probability of zero under the Poisson distribution, and these zeros frequently have a special status. Count data may also be grouped: an observation is not known exactly but only known to fall in a particular range, or a variable such as wages is deliberately grouped before analysis. This paper considers a semiparametric zero-inflated Poisson (ZIP) regression model for such grouped count data with excess zeros. The mean of the Poisson distribution is related to the covariates through a partially linear link function, and the probability of a zero is related to the covariates through a linear link function. A sieve maximum likelihood estimator (MLE) is proposed for both the regression parameters and the nonparametric function, and a score test is provided for the presence of zero inflation. Under certain regularity conditions, the asymptotic properties of the sieve MLE are established: the estimators of the parametric part are strongly consistent, asymptotically normal, and asymptotically efficient, and the estimator of the nonparametric function attains the optimal convergence rate. Simulation studies indicate that both the estimation and the testing procedures perform well. For illustration, the model and inference methods are applied to a data set from a public health survey.
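As a rough illustration of the kind of likelihood the abstract describes, the sketch below writes down a grouped ZIP log-likelihood in Python. It is not the paper's implementation. Several choices are assumptions made only for this example: a log link for the Poisson mean, a logit link for the zero-inflation probability, a polynomial basis standing in for whatever sieve space the paper uses, and the hypothetical names `poly_sieve`, `grouped_zip_loglik`, and `fit_sieve_mle`. Each observation is known only to lie in the interval `[lower, upper]`, and an exact count corresponds to `lower == upper`.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import poisson


def poly_sieve(t, degree):
    """Polynomial basis 1, t, ..., t^degree (assumed stand-in for the sieve space)."""
    return np.vander(np.asarray(t, dtype=float), degree + 1, increasing=True)


def grouped_zip_loglik(theta, lower, upper, X, t, Z, degree):
    """Log-likelihood of interval-grouped zero-inflated Poisson counts.

    theta stacks (beta, alpha, gamma): linear coefficients, sieve
    coefficients of the nonparametric function, and zero-inflation
    coefficients. X should not contain an intercept column, because the
    sieve basis already includes a constant term.
    """
    theta = np.asarray(theta, dtype=float)
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    p, k = X.shape[1], degree + 1
    beta, alpha, gamma = theta[:p], theta[p:p + k], theta[p + k:]
    # Assumed log link: partially linear predictor for the Poisson mean.
    mu = np.exp(X @ beta + poly_sieve(t, degree) @ alpha)
    # Assumed logit link: linear predictor for the extra-zero probability.
    pi = expit(Z @ gamma)
    # Poisson probability that the count falls in [lower, upper].
    pois = poisson.cdf(upper, mu) - poisson.cdf(lower - 1, mu)
    # The extra point mass at zero contributes only to groups containing 0.
    prob = pi * (lower == 0) + (1.0 - pi) * pois
    return float(np.sum(np.log(np.clip(prob, 1e-300, None))))


def fit_sieve_mle(lower, upper, X, t, Z, degree):
    """Maximize the log-likelihood over the finite-dimensional sieve."""
    n_par = X.shape[1] + degree + 1 + Z.shape[1]
    objective = lambda th: -grouped_zip_loglik(th, lower, upper, X, t, Z, degree)
    return minimize(objective, np.zeros(n_par), method="BFGS").x
```

In a sieve approach the basis dimension (here `degree`) is allowed to grow slowly with the sample size. The rate that yields the optimal convergence claimed in the abstract is a matter for the paper itself and is not reproduced here.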