Published By: Aalto University, 3/16/2017
An Aalto University graduate student is developing a new family-tree algorithm called AncestryAI that uses Finnish parish registers to generate a probable family tree. Register data comes from HisKi, an open genealogical database supported by volunteer data entry. AncestryAI users can leave feedback on the accuracy of their tree, which is used to further train the algorithm.
Flesch-Kincaid Grade Level of Article: 12.6
Extended Discussion Questions
- How does this story demonstrate ways in which the Internet enables large-scale collaboration?
  - For example, what might the HisKi parish-records database project have looked like without the Internet?
  - How might the developer have tried to get feedback on the algorithm’s accuracy without the Internet?
- The article points out that comprehensive family trees are often difficult to construct because of missing information.
  - Can you think of other examples where machine learning is used to fill in gaps in data?
  - What are the advantages of using a computational approach over trying to reconstruct missing information by hand?
  - What are the disadvantages or limitations of the computational approach?
- The article mentions that the system can only use records through the early twentieth century, because more recent records aren’t public yet.
  - How does this affect the potential impact of the system?
  - Do you think these kinds of limitations on making church or community records public are a good idea? Why or why not?
- What do you think the people recording their children’s baptisms in parish registers a hundred years ago would have thought if they had known this information would later become so widely available?
Relating This Story to the CSP Curriculum Framework
Global Impact Learning Objectives:
- LO 7.1.1 Explain how computing innovations affect communication, interaction, and cognition.
- LO 7.1.2 Explain how people participate in a problem-solving process that scales.
Global Impact Essential Knowledge:
- EK 7.1.1F Public data provides widespread access and enables solutions to identified problems.
- EK 7.1.1M The Internet and the Web have enhanced methods of and opportunities for communication and collaboration.
- EK 7.1.2C Human computation harnesses contributions from many humans to solve problems related to digital data and the Web.
- EK 7.1.2E Some online services use the contributions of many people to benefit both individuals and society.
- EK 7.1.2F Crowdsourcing offers new models for collaboration, such as connecting people with jobs and businesses with funding.
- EK 7.2.1A Machine learning and data mining have enabled innovation in medicine, business, and science.
Other CSP Big Ideas:
- Idea 3 Data and Information
- Idea 4 Algorithms
Banner Image: “Network Visualization – Violet – Offset Crop”, derivative work by ICSI. New license: CC BY-SA 4.0. Based on “Social Network Analysis Visualization” by Martin Grandjean. Original license: CC BY-SA 3.0.