Peter Baldwin
Principal Measurement Scientist, National Board of Medical Examiners
Verified email at nbme.org
Title
Cited by
Year
Predicting the difficulty of multiple choice questions in a high-stakes medical exam
V Yaneva, P Baldwin, J Mee
Proceedings of the fourteenth workshop on innovative use of NLP for building …, 2019
Cited by: 57; Year: 2019
Predicting the difficulty and response time of multiple choice questions using transfer learning
K Xue, V Yaneva, C Runyon, P Baldwin
Proceedings of the fifteenth workshop on innovative use of NLP for building …, 2020
Cited by: 39; Year: 2020
Predicting item survival for multiple choice questions in a high-stakes medical exam
V Yaneva, P Baldwin, J Mee
Proceedings of the Twelfth Language Resources and Evaluation Conference …, 2020
Cited by: 28; Year: 2020
Using natural language processing to predict item response times and improve test construction
P Baldwin, V Yaneva, J Mee, BE Clauser, LA Ha
Journal of Educational Measurement 58 (1), 4-30, 2021
Cited by: 24; Year: 2021
Using item response time data in test development and validation: Research with beginning computer users
AL Zenisky, P Baldwin
Center for Educational Assessment Report No 593, 2006
Cited by: 24; Year: 2006
Massachusetts adult proficiency tests technical manual, version 2
SG Sireci, P Baldwin, A Martone, AL Zenisky, L Kaira, W Lam, CL Shea, ...
Center for Educational Assessment Research Report No 677, 2008
Cited by: 19; Year: 2008
Hip psychometrics
P Baldwin, J Bernstein, H Wainer
Statistics in Medicine 28 (17), 2277-2292, 2009
Cited by: 18; Year: 2009
Comparison of automated scoring methods for a computerized performance assessment of clinical judgment
P Harik, P Baldwin, B Clauser
Applied Psychological Measurement 37 (8), 587-597, 2013
Cited by: 17; Year: 2013
A comparison of IRT equating methods on recovering item parameters and growth in mixed-format tests
SG Baldwin, P Baldwin, ML Nering
Annual meeting of the American Educational Research Association, Chicago, IL, 2007
Cited by: 17; Year: 2007
A comparison of experimental and observational approaches to assessing the effects of time constraints in a medical licensing examination
P Harik, BE Clauser, I Grabovsky, P Baldwin, MJ Margolis, D Bucak, ...
Journal of Educational Measurement 55 (2), 308-327, 2018
Cited by: 16; Year: 2018
Weighting components of a composite score using naïve expert judgments about their relative importance
P Baldwin
Applied Psychological Measurement 39 (7), 539-550, 2015
Cited by: 14; Year: 2015
An experimental study of the internal consistency of judgments made in bookmark standard setting
BE Clauser, P Baldwin, MJ Margolis, J Mee, M Winward
Journal of Educational Measurement 54 (4), 481-497, 2017
Cited by: 13; Year: 2017
Examining ChatGPT Performance on USMLE Sample Items and Implications for Assessment
V Yaneva, P Baldwin, DP Jurich, K Swygert, BE Clauser
Academic Medicine, 10.1097, 2023
Cited by: 10; Year: 2023
A strategy for developing a common metric in item response theory when parameter posterior distributions are known
P Baldwin
Journal of Educational Measurement 48 (1), 1-11, 2011
Cited by: 10; Year: 2011
The effect of rating unfamiliar items on Angoff passing scores
JC Clauser, RK Hambleton, P Baldwin
Educational and Psychological Measurement 77 (6), 901-916, 2017
Cited by: 9; Year: 2017
Findings from the First Shared Task on Automated Prediction of Difficulty and Response Time for Multiple-Choice Questions
V Yaneva, K North, P Baldwin, S Rezayi, Y Zhou, SR Choudhury, P Harik, ...
Proceedings of the 19th Workshop on Innovative Use of NLP for Building …, 2024
Cited by: 8; Year: 2024
The choice of response probability in bookmark standard setting: an experimental study
P Baldwin, MJ Margolis, BE Clauser, J Mee, M Winward
Educational Measurement: Issues and Practice 39 (1), 37-44, 2020
Cited by: 8; Year: 2020
Assessing the impact of modifications to the documentation component’s scoring rubric and rater training on USMLE integrated clinical encounter scores
SG Baldwin, P Harik, LA Keller, BE Clauser, P Baldwin, TA Rebbecchi
Academic Medicine 84 (10), S97-S100, 2009
Cited by: 7; Year: 2009
Massachusetts adult proficiency tests technical manual
SG Sireci, P Baldwin, A Martone, A Zenisky, RK Hambleton, KT Han
Center for Educational Assessment, University of Massachusetts Amherst, 2006
Cited by: 6; Year: 2006
A modified IRT model intended to improve parameter estimates under small sample conditions
P Baldwin
Annual meeting of the National Council on Measurement in Education, San …, 2006
Cited by: 6; Year: 2006