CASdatasets usage in research articles and books

Claim severity

Claim frequency

Risk measure

Pricing

Extreme value analysis

Multivariate and copula models

Bayesian analysis

Author

Christophe Dutang

This vignette lists published papers and books that use datasets from the CASdatasets package. Within each section, references are ordered alphabetically by first author.

General usage and/or review papers

[1] P. Embrechts and M. V. Wüthrich. “Recent challenges in actuarial science”. In: Annual Review of Statistics and Its Application 9 (2022), pp. 119–140. DOI: 10.1146/annurev-statistics-040120-030244.

[2] A. Mashrur, W. Luo, N. A. Zaidi, et al. “Machine learning for financial risk management: a survey”. In: IEEE Access 8 (2020), pp. 203203–203223. DOI: 10.1109/ACCESS.2020.3036322.

[3] M. V. Wüthrich and M. Merz. Statistical foundations of actuarial learning and its applications. Springer Nature, 2023. DOI: 10.1007/978-3-031-12409-9.

Claim severity modeling

[1] H. Alsuhabi. “The new Topp-Leone exponentied exponential model for modeling financial data”. In: Mathematical Modelling and Control 4.1 (2024), p. 44. DOI: 10.3934/mmc.2024005.

[2] S. A. Bakar and S. Nadarajah. “Composite models with underlying folded distributions”. In: Journal of Computational and Applied Mathematics 390 (2021), p. 113351. DOI: 10.1016/j.cam.2020.113351.

[3] A. Chaturvedi, S. R. Bapat, and N. Joshi. “Sequential estimation of an inverse Gaussian mean with known coefficient of variation”. In: Sankhya B 84.1 (2022), pp. 402–420. DOI: 10.1007/s13571-021-00266-x.

[4] D. Chevalier and M. Côté. “From point to probabilistic gradient boosting for claim frequency and severity prediction”. In: European Actuarial Journal 15.3 (2025), pp. 707–752. DOI: 10.1007/s13385-025-00428-5.

[5] S. Ghaddab, M. Kacem, C. de Peretti, et al. “Extreme severity modeling using a GLM-GPD combination: application to an excess of loss reinsurance treaty”. In: Empirical Economics 65.3 (2023), pp. 1105–1127. DOI: 10.1007/s00181-023-02371-4.

[6] F. Holvoet, K. Antonio, and R. Henckaerts. “Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff”. In: North American Actuarial Journal 29.3 (2025), pp. 519–562. DOI: 10.1080/10920277.2025.2451860.

[7] M. A. Meraou, N. M. Al-Kandari, M. Z. Raqab, et al. “Analysis of skewed data by using compound Poisson exponential distribution with applications to insurance claims”. In: Journal of Statistical Computation and Simulation 92.5 (2022), pp. 928–956. DOI: 10.1080/00949655.2021.1981324.

[8] L. Möstel, M. Fischer, and M. Pfeuffer. “Composite Tukey-type distributions with application to operational risk management”. In: Journal of Operational Risk (2024). DOI: 10.21314/JOP.2023.010.

[9] G. Pittarello, M. Hiabu, and A. M. Villegas. “Replicating and extending chain-ladder via an age–period–cohort structure on the claim development in a run-off triangle”. In: North American Actuarial Journal 30.1 (2026), pp. 1–31. DOI: 10.1080/10920277.2025.2496725.

[10] N. Počuča, P. Jevtić, P. D. McNicholas, et al. “Modeling frequency and severity of claims with the zero-inflated generalized cluster-weighted models”. In: Insurance: Mathematics and Economics 94 (2020), pp. 79–93. DOI: 10.1016/j.insmatheco.2020.06.004.

[11] A. Punzo. “A new look at the inverse Gaussian distribution with applications to insurance and economic data”. In: Journal of Applied Statistics 46.7 (2019), pp. 1260–1287. DOI: 10.1080/02664763.2018.1542668.

[12] M. Qazvini. “On the validation of claims with excess zeros in liability insurance: A comparative study”. In: Risks 7.3 (2019), p. 71. DOI: 10.3390/risks7030071.

[13] M. Raschke. “Alternative modelling and inference methods for claim size distributions”. In: Annals of Actuarial Science 14.1 (2020), pp. 1–19. DOI: 10.1017/S1748499519000010.

[14] S. D. Tomarchio, A. Punzo, J. T. Ferreira, et al. “Mode mixture of unimodal distributions for insurance loss data”. In: Annals of Operations Research (2024), pp. 1–19. DOI: 10.1007/s10479-024-06063-9.

Claim frequency modeling

[1] D. Chevalier and M. Côté. “From point to probabilistic gradient boosting for claim frequency and severity prediction”. In: European Actuarial Journal 15.3 (2025), pp. 707–752. DOI: 10.1007/s13385-025-00428-5.

[2] Ł. Delong, M. Lindholm, and M. V. Wüthrich. “Making Tweedie’s compound Poisson model more accessible”. In: European Actuarial Journal 11.1 (2021), pp. 185–226. DOI: 10.1007/s13385-021-00264-3.

[3] F. Holvoet, K. Antonio, and R. Henckaerts. “Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff”. In: North American Actuarial Journal 29.3 (2025), pp. 519–562. DOI: 10.1080/10920277.2025.2451860.

[4] Y. Liu, W. Li, and X. Zhang. “A marginalized zero-truncated Poisson regression model and its model averaging prediction”. In: Communications in Mathematics and Statistics 13.3 (2025), pp. 527–570. DOI: 10.1007/s40304-022-00312-8.

[5] M. A. Meraou, N. M. Al-Kandari, M. Z. Raqab, et al. “Analysis of skewed data by using compound Poisson exponential distribution with applications to insurance claims”. In: Journal of Statistical Computation and Simulation 92.5 (2022), pp. 928–956. DOI: 10.1080/00949655.2021.1981324.

[6] M. A. Meraou, M. Z. Raqab, and F. B. Almathkour. “Analyzing insurance data with an alpha power transformed exponential Poisson model”. In: Annals of Data Science 12.3 (2025), pp. 991–1011. DOI: 10.1007/s40745-024-00554-z.

[7] J. Merupula, V. Vaidyanathan, and C. Chesneau. “Prediction Interval for Compound Conway–Maxwell–Poisson Regression Model with Application to Vehicle Insurance Claim Data”. In: Mathematical and Computational Applications 28.2 (2023), p. 39. DOI: 10.3390/mca28020039.

[8] N. Počuča, P. Jevtić, P. D. McNicholas, et al. “Modeling frequency and severity of claims with the zero-inflated generalized cluster-weighted models”. In: Insurance: Mathematics and Economics 94 (2020), pp. 79–93. DOI: 10.1016/j.insmatheco.2020.06.004.

[9] G. Willame, J. Trufin, and M. Denuit. “Boosted Poisson regression trees: a guide to the BT package in R”. In: Annals of Actuarial Science 18.3 (2024), pp. 605–625. DOI: 10.1017/S174849952300026X.

Risk measure

[1] A. G. Abubakari. “Actuarial measures, regression, and applications of exponentiated Fréchet loss distribution”. In: International Journal of Mathematics and Mathematical Sciences 2022.1 (2022), p. 3155188. DOI: 10.1155/2022/3155188.

[2] A. Y. Bin-Nun, C. Lizarazo, A. Panasci, et al. “What do surrogate safety metrics measure? Understanding driving safety as a continuum”. In: Accident Analysis & Prevention 195 (2024), p. 107245. DOI: 10.1016/j.aap.2023.107245.

[3] Y. Guan, Z. Jiao, and R. Wang. “A reverse ES (CVaR) optimization formula”. In: North American Actuarial Journal 28.3 (2024), pp. 611–625. DOI: 10.1080/10920277.2023.2249524.

[4] A. Staino, E. Russo, M. Costabile, et al. “Minimum capital requirement and portfolio allocation for non-life insurance: a semiparametric model with Conditional Value-at-Risk (CVaR) constraint”. In: Computational Management Science 20.1 (2023), p. 12. DOI: 10.1007/s10287-023-00439-1.

Insurance pricing

[1] A. Brauer. “Enhancing actuarial non-life pricing models via transformers”. In: European Actuarial Journal 14.3 (2024), pp. 991–1012. DOI: 10.1007/s13385-024-00388-2.

[4] M. Lindholm, F. Lindskog, and J. Palmquist. “Local bias adjustment, duration-weighted probabilities, and automatic construction of tariff cells”. In: Scandinavian Actuarial Journal 2023.10 (2023), pp. 946–973. DOI: 10.1080/03461238.2023.2176251.

[5] M. Lindholm and T. Nazar. “On duration effects in non-life insurance pricing”. In: European Actuarial Journal 14.3 (2024), pp. 809–832. DOI: 10.1007/s13385-024-00385-5.

[6] M. Lindholm and J. Palmquist. “Black-box guided generalised linear model building with non-life pricing applications”. In: Annals of Actuarial Science 18.3 (2024), pp. 675–691. DOI: 10.1017/S1748499524000265.

[8] M. Meraou, N. Al-Kandari, and M. Raqab. “Univariate and bivariate compound models based on random sum of variates with application to the insurance losses data”. In: Journal of Statistical Theory and Practice 16.4 (2022), p. 56. DOI: 10.1007/s42519-022-00282-8.

[9] R. Wang, H. Shi, and J. Cao. “A Nested GLM Framework with Neural Network Encoding and Spatially Constrained Clustering in Non-Life Insurance Ratemaking”. In: North American Actuarial Journal 29.3 (2025), pp. 645–661. DOI: 10.1080/10920277.2024.2442416.

[10] X. Xin and F. Huang. “Antidiscrimination insurance pricing: Regulations, fairness criteria, and models”. In: North American Actuarial Journal 28.2 (2024), pp. 285–319. DOI: 10.1080/10920277.2023.2190528.

Extreme value analysis

[1] M. Allouche, J. El Methni, and S. Girard. “A refined Weissman estimator for extreme quantiles”. In: Extremes 26.3 (2023), pp. 545–572. DOI: 10.1007/s10687-022-00452-8.

[2] S. Girard, G. Stupfler, and A. Usseglio-Carleve. “On automatic bias reduction for extreme expectile estimation”. In: Statistics and Computing 32.4 (2022), p. 64. DOI: 10.1007/s11222-022-10118-x.

[3] J. Meng and K. Chan. “Penalized quasi-likelihood estimation of generalized Pareto regression–consistent identification of risk factors for extreme losses”. In: Insurance: Mathematics and Economics 104 (2022), pp. 60–75. DOI: 10.1016/j.insmatheco.2022.01.005.

[4] G. Stupfler and A. Usseglio-Carleve. “Composite bias-reduced L^p-quantile-based estimators of extreme quantiles and expectiles”. In: Canadian Journal of Statistics 51.2 (2023), pp. 704–742. DOI: 10.1002/cjs.11703.

Multivariate and copula models

[1] A. Brouste, C. Dutang, L. Hovsepyan, et al. “Fast inference in copula models with categorical explanatory variables using the one-step procedure”. In: Computational Statistics 41.1 (2026), p. 23. DOI: 10.1007/s00180-025-01692-5.

[2] N. W. Deresa, I. Van Keilegom, and K. Antonio. “Copula-based inference for bivariate survival data with left truncation and dependent censoring”. In: Insurance: Mathematics and Economics 107 (2022), pp. 1–21. DOI: 10.1016/j.insmatheco.2022.07.011.

[3] Q. Hoang, P. Khandelwal, and S. Ghosh. “Robust predictive model using copulas”. In: Data-Enabled Discovery and Applications 3.1 (2019), p. 8. DOI: 10.1007/s41688-019-0032-y.

[4] M. Meraou, N. Al-Kandari, and M. Raqab. “Univariate and bivariate compound models based on random sum of variates with application to the insurance losses data”. In: Journal of Statistical Theory and Practice 16.4 (2022), p. 56. DOI: 10.1007/s42519-022-00282-8.

[5] S. F. Syed Yusoff Alhabshi, Z. H. Zamzuri, and S. N. Mohd Ramli. “Monte Carlo simulation of the moments of a copula-dependent risk process with Weibull interwaiting time”. In: Risks 9.6 (2021), p. 109. DOI: 10.3390/risks9060109.

Bayesian analysis

[1] P. Goffard and P. J. Laub. “Approximate Bayesian Computations to fit and compare insurance loss models”. In: Insurance: Mathematics and Economics 100 (2021), pp. 350–371. DOI: 10.1016/j.insmatheco.2021.06.002.

[2] F. Ungolo and E. R. van den Heuvel. “A Dirichlet process mixture regression model for the analysis of competing risk events”. In: Insurance: Mathematics and Economics 116 (2024), pp. 95–113. DOI: 10.1016/j.insmatheco.2024.02.004.
