Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How is data from complex sampling designs handled in machine learning?

I am applying machine learning techniques to projects in the field of education, and I would appreciate help with the questions below.

I am using nationally representative data in my project, and the data were collected through a two-stage sampling process.

Because complex sampling was used to select participants, statistical models would usually account for the unequal selection probabilities of individual participants (e.g., using sampling weights), for stratification/blocking, and for the non-independence of student outcomes within schools, in order to obtain representative population estimates. How is data from complex sampling designs handled in machine learning? (I would expect that at least the probability weights need to be accounted for, but this is not my area of expertise.)
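To make concrete what I mean by accounting for probability weights, here is a minimal sketch in plain Python. The toy numbers, the two strata, and the helper name `weighted_mean` are all hypothetical and not taken from my data. It only shows how inverse-probability weights change a simple estimate when one stratum is oversampled; it is not a full design-based analysis.

```python
def weighted_mean(values, weights):
    """Weighted (Hajek-style) estimate of a population mean."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical population: stratum A has 900 students (score 50),
# stratum B has 100 students (score 80), so the true mean is 53.
# The sample takes 10 students from each stratum, so B is oversampled.
scores = [50.0] * 10 + [80.0] * 10
selection_prob = [10 / 900] * 10 + [10 / 100] * 10

# Sampling weight = inverse of the selection probability.
weights = [1.0 / p for p in selection_prob]

unweighted = sum(scores) / len(scores)    # ignores the design
weighted = weighted_mean(scores, weights)  # recovers the population mean

print(f"unweighted: {unweighted:.1f}, weighted: {weighted:.1f}")
```

As far as I understand, many estimators in libraries such as scikit-learn accept per-row weights through a `sample_weight` argument to `fit`, which would play the same role for a model's loss as the weights play for the mean here. Whether that alone is sufficient for a two-stage design is exactly what I am unsure about.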

Another way to approach this question may be: Does machine learning require a starting assumption that the observed data was obtained by random or representative sampling? Many of these methods sample from the initial dataset as part of the analytic algorithm, but I am wondering about necessary conditions for the initial dataset.
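One place where this seems to matter is any step that splits or resamples rows as if they were independent, such as cross-validation. The sketch below is my own illustration in plain Python, with hypothetical school IDs and a hypothetical helper `group_k_fold`. It assigns whole schools to folds so that students from the same school never appear on both the training and the test side. It assumes that the school is the clustering unit and does not address weights or strata.

```python
def group_k_fold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs that keep each group on one side."""
    unique_groups = sorted(set(groups))
    if n_splits < 2 or n_splits > len(unique_groups):
        raise ValueError("n_splits must be between 2 and the number of groups")
    # Assign each group (school) to exactly one fold, round-robin.
    fold_of_group = {g: i % n_splits for i, g in enumerate(unique_groups)}
    for fold in range(n_splits):
        test_idx = [i for i, g in enumerate(groups) if fold_of_group[g] == fold]
        train_idx = [i for i, g in enumerate(groups) if fold_of_group[g] != fold]
        yield train_idx, test_idx


# Hypothetical sample: eight students nested in four schools.
school_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
splits = list(group_k_fold(school_ids, n_splits=2))

for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

scikit-learn provides a `GroupKFold` splitter in `sklearn.model_selection` that serves the same purpose; the hand-written version above is only meant to show the idea without extra dependencies.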

I would greatly appreciate your input!

Question from: https://stackoverflow.com/questions/65713572/how-is-data-from-complex-sampling-designs-handled-in-machine-learning


1 Reply

Waiting for answers


...