CASdatasets usage in research articles and books

Author

Christophe Dutang

This vignette lists published papers and books that use datasets from the CASdatasets package. Within each topic, references are ordered alphabetically by first author.

General usage and/or review papers

[1] P. Embrechts and M. V. Wüthrich. “Recent challenges in actuarial science”. In: Annual Review of Statistics and Its Application 9 (2022), pp. 119–140. DOI: 10.1146/annurev-statistics-040120-030244.

[2] A. Mashrur, W. Luo, N. A. Zaidi, et al. “Machine learning for financial risk management: a survey”. In: IEEE Access 8 (2020), pp. 203203–203223. DOI: 10.1109/ACCESS.2020.3036322.

[3] M. V. Wüthrich and M. Merz. Statistical foundations of actuarial learning and its applications. Springer Nature, 2023. DOI: 10.1007/978-3-031-12409-9.

Claim severity modeling

[1] H. Alsuhabi. “The new Topp-Leone exponentied exponential model for modeling financial data”. In: Mathematical Modelling and Control 4.1 (2024), p. 44. DOI: 10.3934/mmc.2024005.

[2] S. A. Bakar and S. Nadarajah. “Composite models with underlying folded distributions”. In: Journal of Computational and Applied Mathematics 390 (2021), p. 113351. DOI: 10.1016/j.cam.2020.113351.

[3] A. Chaturvedi, S. R. Bapat, and N. Joshi. “Sequential estimation of an inverse Gaussian mean with known coefficient of variation”. In: Sankhya B 84.1 (2022), pp. 402–420. DOI: 10.1007/s13571-021-00266-x.

[4] D. Chevalier and M. Côté. “From point to probabilistic gradient boosting for claim frequency and severity prediction”. In: European Actuarial Journal 15.3 (2025), pp. 707–752. DOI: 10.1007/s13385-025-00428-5.

[5] S. Ghaddab, M. Kacem, C. de Peretti, et al. “Extreme severity modeling using a GLM-GPD combination: application to an excess of loss reinsurance treaty”. In: Empirical Economics 65.3 (2023), pp. 1105–1127. DOI: 10.1007/s00181-023-02371-4.

[6] F. Holvoet, K. Antonio, and R. Henckaerts. “Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff”. In: North American Actuarial Journal 29.3 (2025), pp. 519–562. DOI: 10.1080/10920277.2025.2451860.

[7] M. A. Meraou, N. M. Al-Kandari, M. Z. Raqab, et al. “Analysis of skewed data by using compound Poisson exponential distribution with applications to insurance claims”. In: Journal of Statistical Computation and Simulation 92.5 (2022), pp. 928–956. DOI: 10.1080/00949655.2021.1981324.

[8] L. Möstel, M. Fischer, and M. Pfeuffer. “Composite Tukey-type distributions with application to operational risk management”. In: Journal of Operational Risk (2024). DOI: 10.21314/JOP.2023.010.

[9] G. Pittarello, M. Hiabu, and A. M. Villegas. “Replicating and extending chain-ladder via an age–period–cohort structure on the claim development in a run-off triangle”. In: North American Actuarial Journal 30.1 (2026), pp. 1–31. DOI: 10.1080/10920277.2025.2496725.

[10] N. Počuča, P. Jevtić, P. D. McNicholas, et al. “Modeling frequency and severity of claims with the zero-inflated generalized cluster-weighted models”. In: Insurance: Mathematics and Economics 94 (2020), pp. 79–93. DOI: 10.1016/j.insmatheco.2020.06.004.

[11] A. Punzo. “A new look at the inverse Gaussian distribution with applications to insurance and economic data”. In: Journal of Applied Statistics 46.7 (2019), pp. 1260–1287. DOI: 10.1080/02664763.2018.1542668.

[12] M. Qazvini. “On the validation of claims with excess zeros in liability insurance: A comparative study”. In: Risks 7.3 (2019), p. 71. DOI: 10.3390/risks7030071.

[13] M. Raschke. “Alternative modelling and inference methods for claim size distributions”. In: Annals of Actuarial Science 14.1 (2020), pp. 1–19. DOI: 10.1017/S1748499519000010.

[14] S. D. Tomarchio, A. Punzo, J. T. Ferreira, et al. “Mode mixture of unimodal distributions for insurance loss data”. In: Annals of Operations Research (2024), pp. 1–19. DOI: 10.1007/s10479-024-06063-9.

Claim frequency modeling

[1] D. Chevalier and M. Côté. “From point to probabilistic gradient boosting for claim frequency and severity prediction”. In: European Actuarial Journal 15.3 (2025), pp. 707–752. DOI: 10.1007/s13385-025-00428-5.

[2] Ł. Delong, M. Lindholm, and M. V. Wüthrich. “Making Tweedie’s compound Poisson model more accessible”. In: European Actuarial Journal 11.1 (2021), pp. 185–226. DOI: 10.1007/s13385-021-00264-3.

[3] F. Holvoet, K. Antonio, and R. Henckaerts. “Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff”. In: North American Actuarial Journal 29.3 (2025), pp. 519–562. DOI: 10.1080/10920277.2025.2451860.

[4] Y. Liu, W. Li, and X. Zhang. “A marginalized zero-truncated Poisson regression model and its model averaging prediction”. In: Communications in Mathematics and Statistics 13.3 (2025), pp. 527–570. DOI: 10.1007/s40304-022-00312-8.

[5] M. A. Meraou, N. M. Al-Kandari, M. Z. Raqab, et al. “Analysis of skewed data by using compound Poisson exponential distribution with applications to insurance claims”. In: Journal of Statistical Computation and Simulation 92.5 (2022), pp. 928–956. DOI: 10.1080/00949655.2021.1981324.

[6] M. A. Meraou, M. Z. Raqab, and F. B. Almathkour. “Analyzing insurance data with an alpha power transformed exponential Poisson model”. In: Annals of Data Science 12.3 (2025), pp. 991–1011. DOI: 10.1007/s40745-024-00554-z.

[7] J. Merupula, V. Vaidyanathan, and C. Chesneau. “Prediction Interval for Compound Conway–Maxwell–Poisson Regression Model with Application to Vehicle Insurance Claim Data”. In: Mathematical and Computational Applications 28.2 (2023), p. 39. DOI: 10.3390/mca28020039.

[8] N. Počuča, P. Jevtić, P. D. McNicholas, et al. “Modeling frequency and severity of claims with the zero-inflated generalized cluster-weighted models”. In: Insurance: Mathematics and Economics 94 (2020), pp. 79–93. DOI: 10.1016/j.insmatheco.2020.06.004.

[9] G. Willame, J. Trufin, and M. Denuit. “Boosted Poisson regression trees: a guide to the BT package in R”. In: Annals of Actuarial Science 18.3 (2024), pp. 605–625. DOI: 10.1017/S174849952300026X.

Risk measure

[1] A. G. Abubakari. “Actuarial measures, regression, and applications of exponentiated Fréchet loss distribution”. In: International Journal of Mathematics and Mathematical Sciences 2022.1 (2022), p. 3155188. DOI: 10.1155/2022/3155188.

[2] A. Y. Bin-Nun, C. Lizarazo, A. Panasci, et al. “What do surrogate safety metrics measure? Understanding driving safety as a continuum”. In: Accident Analysis & Prevention 195 (2024), p. 107245. DOI: 10.1016/j.aap.2023.107245.

[3] Y. Guan, Z. Jiao, and R. Wang. “A reverse ES (CVaR) optimization formula”. In: North American Actuarial Journal 28.3 (2024), pp. 611–625. DOI: 10.1080/10920277.2023.2249524.

[4] A. Staino, E. Russo, M. Costabile, et al. “Minimum capital requirement and portfolio allocation for non-life insurance: a semiparametric model with Conditional Value-at-Risk (CVaR) constraint”. In: Computational Management Science 20.1 (2023), p. 12. DOI: 10.1007/s10287-023-00439-1.

Insurance pricing

[1] A. Brauer. “Enhancing actuarial non-life pricing models via transformers”. In: European Actuarial Journal 14.3 (2024), pp. 991–1012. DOI: 10.1007/s13385-024-00388-2.

[2] Ł. Delong, M. Lindholm, and M. V. Wüthrich. “Making Tweedie’s compound Poisson model more accessible”. In: European Actuarial Journal 11.1 (2021), pp. 185–226. DOI: 10.1007/s13385-021-00264-3.

[3] F. Holvoet, K. Antonio, and R. Henckaerts. “Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff”. In: North American Actuarial Journal 29.3 (2025), pp. 519–562. DOI: 10.1080/10920277.2025.2451860.

[4] M. Lindholm, F. Lindskog, and J. Palmquist. “Local bias adjustment, duration-weighted probabilities, and automatic construction of tariff cells”. In: Scandinavian Actuarial Journal 2023.10 (2023), pp. 946–973. DOI: 10.1080/03461238.2023.2176251.

[5] M. Lindholm and T. Nazar. “On duration effects in non-life insurance pricing”. In: European Actuarial Journal 14.3 (2024), pp. 809–832. DOI: 10.1007/s13385-024-00385-5.

[6] M. Lindholm and J. Palmquist. “Black-box guided generalised linear model building with non-life pricing applications”. In: Annals of Actuarial Science 18.3 (2024), pp. 675–691. DOI: 10.1017/S1748499524000265.

[7] M. A. Meraou, N. M. Al-Kandari, M. Z. Raqab, et al. “Analysis of skewed data by using compound Poisson exponential distribution with applications to insurance claims”. In: Journal of Statistical Computation and Simulation 92.5 (2022), pp. 928–956. DOI: 10.1080/00949655.2021.1981324.

[8] M. Meraou, N. Al-Kandari, and M. Raqab. “Univariate and bivariate compound models based on random sum of variates with application to the insurance losses data”. In: Journal of Statistical Theory and Practice 16.4 (2022), p. 56. DOI: 10.1007/s42519-022-00282-8.

[9] R. Wang, H. Shi, and J. Cao. “A Nested GLM Framework with Neural Network Encoding and Spatially Constrained Clustering in Non-Life Insurance Ratemaking”. In: North American Actuarial Journal 29.3 (2025), pp. 645–661. DOI: 10.1080/10920277.2024.2442416.

[10] X. Xin and F. Huang. “Antidiscrimination insurance pricing: Regulations, fairness criteria, and models”. In: North American Actuarial Journal 28.2 (2024), pp. 285–319. DOI: 10.1080/10920277.2023.2190528.

Extreme value analysis

[1] M. Allouche, J. El Methni, and S. Girard. “A refined Weissman estimator for extreme quantiles”. In: Extremes 26.3 (2023), pp. 545–572. DOI: 10.1007/s10687-022-00452-8.

[2] S. Girard, G. Stupfler, and A. Usseglio-Carleve. “On automatic bias reduction for extreme expectile estimation”. In: Statistics and Computing 32.4 (2022), p. 64. DOI: 10.1007/s11222-022-10118-x.

[3] J. Meng and K. Chan. “Penalized quasi-likelihood estimation of generalized Pareto regression–consistent identification of risk factors for extreme losses”. In: Insurance: Mathematics and Economics 104 (2022), pp. 60–75. DOI: 10.1016/j.insmatheco.2022.01.005.

[4] G. Stupfler and A. Usseglio-Carleve. “Composite bias-reduced \(L^p\)-quantile-based estimators of extreme quantiles and expectiles”. In: Canadian Journal of Statistics 51.2 (2023), pp. 704–742. DOI: 10.1002/cjs.11703.

Multivariate and copula models

[1] A. Brouste, C. Dutang, L. Hovsepyan, et al. “Fast inference in copula models with categorical explanatory variables using the one-step procedure”. In: Computational Statistics 41.1 (2026), p. 23. DOI: 10.1007/s00180-025-01692-5.

[2] N. W. Deresa, I. Van Keilegom, and K. Antonio. “Copula-based inference for bivariate survival data with left truncation and dependent censoring”. In: Insurance: Mathematics and Economics 107 (2022), pp. 1–21. DOI: 10.1016/j.insmatheco.2022.07.011.

[3] Q. Hoang, P. Khandelwal, and S. Ghosh. “Robust predictive model using copulas”. In: Data-Enabled Discovery and Applications 3.1 (2019), p. 8. DOI: 10.1007/s41688-019-0032-y.

[4] M. Meraou, N. Al-Kandari, and M. Raqab. “Univariate and bivariate compound models based on random sum of variates with application to the insurance losses data”. In: Journal of Statistical Theory and Practice 16.4 (2022), p. 56. DOI: 10.1007/s42519-022-00282-8.

[5] S. F. Syed Yusoff Alhabshi, Z. H. Zamzuri, and S. N. Mohd Ramli. “Monte Carlo simulation of the moments of a copula-dependent risk process with Weibull interwaiting time”. In: Risks 9.6 (2021), p. 109. DOI: 10.3390/risks9060109.

Bayesian analysis

[1] P. Goffard and P. J. Laub. “Approximate Bayesian Computations to fit and compare insurance loss models”. In: Insurance: Mathematics and Economics 100 (2021), pp. 350–371. DOI: 10.1016/j.insmatheco.2021.06.002.

[2] F. Ungolo and E. R. van den Heuvel. “A Dirichlet process mixture regression model for the analysis of competing risk events”. In: Insurance: Mathematics and Economics 116 (2024), pp. 95–113. DOI: 10.1016/j.insmatheco.2024.02.004.

Other topics

[1] H. M. Aljohani. “Statistical inference for a novel distribution using ranked set sampling with applications”. In: Heliyon 10.5 (2024). DOI: 10.1016/j.heliyon.2024.e26893.

[2] B. Avanzi, E. Dong, P. J. Laub, et al. “Distributional refinement network: Distributional forecasting via deep learning”. In: Insurance: Mathematics and Economics (2026), p. 103246. DOI: 10.1016/j.insmatheco.2026.103246.

[3] B. Avanzi, G. Taylor, and M. Wang. “SPLICE: a synthetic paid loss and incurred cost experience simulator”. In: Annals of Actuarial Science 17.1 (2023), pp. 7–35. DOI: 10.1017/S1748499522000057.

[4] M. Bladt and C. B. Gardner. “Joint discrete and continuous matrix distribution modeling”. In: Stochastic Models 40.1 (2024), pp. 1–37. DOI: 10.1080/15326349.2023.2185257.

[5] A. Brouste, C. Dutang, and T. Rohmer. “A closed-form alternative estimator for GLM with categorical explanatory variables”. In: Communications in Statistics-Simulation and Computation 53.5 (2024), pp. 2444–2460. DOI: 10.1080/03610918.2022.2076870.

[6] R. Henckaerts, K. Antonio, and M. Côté. “When stakes are high: Balancing accuracy and transparency with Model-Agnostic Interpretable Data-driven suRRogates”. In: Expert Systems with Applications 202 (2022), p. 117230. DOI: 10.1016/j.eswa.2022.117230.

[7] D. Lim, A. Neufeld, S. Sabanis, et al. “Langevin dynamics based algorithm e-TH\(\varepsilon\)O POULA for stochastic optimization problems with discontinuous stochastic gradient”. In: Mathematics of Operations Research 50.3 (2025), pp. 2333–2374. DOI: 10.1287/moor.2022.0307.

[8] A. Majeed. “Accelerated failure time models: An application in insurance attrition”. In: The Journal of Risk Management and Insurance (2020).

[9] T. Miljkovic and D. Fernández. “On two mixture-based clustering approaches used in modeling an insurance portfolio”. In: Risks 6.2 (2018), p. 57. DOI: 10.3390/risks6020057.

[10] T. Miljkovic and P. Wang. “A dimension reduction assisted credit scoring method for big data with categorical features”. In: Financial Innovation 11.1 (2025), p. 29. DOI: 10.1186/s40854-024-00689-1.

[11] J. Ponnet, J. Raymaekers, and T. Verdonck. “Fast thresholded concordance probability for evolutionary optimization”. In: Swarm and Evolutionary Computation 78 (2023), p. 101260. DOI: 10.1016/j.swevo.2023.101260.

[12] R. Richman and M. V. Wüthrich. “LocalGLMnet: interpretable deep learning for tabular data”. In: Scandinavian Actuarial Journal 2023.1 (2023), pp. 71–95. DOI: 10.1080/03461238.2022.2081816.

[13] R. Richman and M. V. Wüthrich. “Nagging predictors”. In: Risks 8.3 (2020), p. 83. DOI: 10.3390/risks8030083.

[14] P. Shi and K. Shi. “Non-life insurance risk classification using categorical embedding”. In: North American Actuarial Journal 27.3 (2023), pp. 579–601. DOI: 10.1080/10920277.2022.2123361.

[15] S. C. Tseung, A. L. Badescu, T. C. Fung, et al. “LRMoE.jl: a software package for insurance loss modelling using mixture of experts regression model”. In: Annals of Actuarial Science 15.2 (2021), pp. 419–440. DOI: 10.1017/S1748499521000087.

[16] M. V. Wüthrich and J. Ziegel. “Isotonic recalibration under a low signal-to-noise ratio”. In: Scandinavian Actuarial Journal 2024.3 (2024), pp. 279–299. DOI: 10.1080/03461238.2023.2246743.