信息

内容

Books 美国大学生数学建模竞赛 2012 ICM Problem：Modeling for Crime Busting

2012 ICM Problem：Modeling for Crime Busting

Your organization, the Intergalactic Crime Modelers (ICM), is investigating a conspiracy to commit a criminal act. The investigators are highly confident they know several members of the conspiracy, but hope to identify the other members and the leaders before they make arrests. The conspirators and the possible suspected conspirators all work for the same company in a large office complex. The company has been growing fast and making a name for itself in developing and marketing computer software for banks and credit card companies. ICM has recently found a small set of messages from a group of 82 workers in the company that they believe will help them find the most likely candidates for the unidentified co‐conspirators and unknown leaders. Since the message traffic is for all the office workers in the company, it is very likely that some (maybe many) of the identified communicators in the message traffic are not involved in the conspiracy. In fact, they are certain that they know some people who are not in the conspiracy. The goal of the modeling effort will be to identify people in the office complex who are the most likely conspirators. A priority list would be ideal so ICM could investigate, place under surveillance, and/or interrogate the most likely candidates. A discriminate line separating conspirators from non‐conspirators would also be helpful to distinctly categorize the people in each group. It would also be helpful to the DA’s office if the model nominated the conspiracy leaders.

Before the data of the current case are given to your crime modeling team, your supervisor gives you the following scenario (called Investigation EZ) that she worked on a few years ago in another city. Even though she is very proud of her work on the EZ case, she says it is just a very small, simple example, but it may help you understand your task. Her data follow:

The ten people she was considering as conspirators were: Anne#, Bob, Carol, Dave*, Ellen, Fred, George*, Harry, Inez, and Jaye#. (* indicates prior known conspirators, # indicate prior known non‐ conspirators)

Chronology of the 28 messages that she had for her case with a code number for the topic of each message that she assigned based on her analysis of the message:

Anne to Bob: Why were you late today? (1)

Bob to Carol: That darn Anne always watches me. I wasn't late. (1)

Carol to Dave: Anne and Bob are fighting again over Bob's tardiness. (1)

Dave to Ellen: I need to see you this morning. When can you come by? Bring the budget files. (2)

Dave to Fred: I can come by and see you anytime today. Let me know when it is a good time. Should I bring the budget files? (2)

Dave to George: I will see you later ‐‐‐ lots to talk about. I hope the others are ready. It is important to get this right. (3)

Harry to George: You seem stressed. What is going on? Our budget will be fine. (2) (4)

Inez to George: I am real tired today. How are you doing? ( 5)

Jaye to Inez: Not much going on today. Wanna go to lunch today? (5)

Inez to Jaye: Good thing it is quiet. I am exhausted. Can't do lunch today ‐‐‐ sorry! (5)

George to Dave: Time to talk ‐‐‐ now! (3)

Jaye to Anne: Can you go to lunch today? (5)

Dave to George: I can't. On my way to see Fred. (3)

George to Dave: Get here after that. (3)

Anne to Carol: Who is supposed to watch Bob? He is goofing off all the time. (1)

Carol to Anne: Leave him alone. He is working well with George and Dave. (1)

George to Dave: This is important. Darn Fred. How about Ellen? (3)

Ellen to George: Have you talked with Dave? (3)

George to Ellen: Not yet. Did you? (3)

Bob to Anne: I wasn't late. And just so you know ‐‐‐ I am working through lunch. (1)

Bob to Dave: Tell them I wasn't late. You know me. (1)

Ellen to Carol: Get with Anne and figure out the budget meeting schedule for next week and help me calm George. (2)

Harry to Dave: Did you notice that George is stressed out again today? (4)

Dave to George: Darn Harry thinks you are stressed. Don't get him worried or he will be nosing around. (4)

George to Harry: Just working late and having problems at home. I will be fine. (4)

Ellen to Harry: Would it be OK, if I miss the meeting today? Fred will be there and he knows the budget better than I do. (2)

Harry to Fred: I think next year's budget is stressing out a few people. Maybe we should take time to reassure people today. (2) (4)

Fred to Harry: I think our budget is pretty healthy. I don't see anything to stress over. (2) END of MESSAGE TRAFFIC

Your supervisor points outs that she assigned and coded only 5 different topics of messages: 1) Bob's tardiness, 2) the budget, 3) important unknown issue but assumed to be part of conspiracy, 4) George's stress, and 5) lunch and other social issues. As seen in the message coding, some messages had two topics assigned because of the content of the messages.

The way your supervisor analyzed her situation was with a network that showed the communication links and the types of messages. The following figure is a model of the message network that resulted with the code for the types of messages annotated on the network graph.

Figure 1: Network of messages from EZ Case

Your supervisor points out that in addition to known conspirators George and Dave, Ellen and Carol were indicted for the conspiracy based on your supervisor's analysis and later Bob self‐admitted his involvement in a plea bargain for a reduced sentence, but the charges against Carol were later dropped. Your supervisor is still pretty sure Inez was involved, but the case against her was never established. Your supervisor's advice to your team is identify the guilty parties so that people like Inez don't get off, people like Carol are not falsely accused, and ICM gets the credit so people like Bob do not have the opportunity to get reduced sentences.

The current case:

Your supervisor has put together a network‐like database for the current case, which has the same structure, but is a bit larger in scope. The investigators have some indications that a conspiracy is taking place to embezzle funds from the company and use internet fraud to steal funds from credit cards of people who do business with the company. The small example she showed you for case EZ had only 10 people (nodes), 27 links (messages), 5 topics, 1 suspicious/conspiracy topic, 2 known conspirators, and 2

known non‐conspirators. So far, the new case has 83 nodes, 400 links (some involving more than 1 topic), over 21,000 words of message traffic, 15 topics (3 have been deemed to be suspicious), 7 known conspirators, and 8 known non‐conspirators. These data are given in the attached spreadsheet files: Names.xls, Topics.xls, and Messages.xls. Names.xls contains the key of node number to the office workers' names. Topics.xls contains the code for the 15 topic numbers to a short description of the topics. Because of security and privacy issues, your team will not have direct transcripts of all the message traffic. Messages.xls provides the links of the nodes that transmitted messages and the topic code numbers that the messages contained. Several messages contained up to three topics. To help visualize the message traffic, a network model of the people and message links is provided in Figure 2. In this case, the topics of the messages are not shown in the figure as they were in Figure 1. These topic numbers are given in the file Messages.xls and described in Topics.xls.

Figure 2: Visual of the network model of the 83 people (nodes) and 400 messages between these people (links).

Requirements:

Requirement 1: So far, it is known that Jean, Alex, Elsie, Paul, Ulf, Yao, and Harvey are conspirators. Also, it is known that Darlene, Tran, Jia, Ellin, Gard, Chris, Paige, and Este are not conspirators. The three known suspicious message topics are 7, 11, and 13. There is more detail about the topics in file Topics.xls. Build a model and algorithm to prioritize the 83 nodes by likelihood of being part of the conspiracy and explain your model and metrics. Jerome, Delores, and Gretchen are the senior managers of the company. It would be very helpful to know if any of them are involved in the conspiracy.

Requirement 2: How would the priority list change if new information comes to light that Topic 1 is also connected to the conspiracy and that Chris is one of the conspirators?

Requirement 3: A powerful technique to obtain and understand text information similar to this message traffic is called semantic network analysis; as a methodology in artificial intelligence and computational linguistics, it provides a structure and process for reasoning about knowledge or language. Another computational linguistics capability in natural language processing is text analysis. For our crime busting scenario, explain how semantic and text analyses of the content and context of the message traffic (if you could obtain the original messages) could empower your team to develop even better models and categorizations of the office personnel. Did you use any of these capabilities on the topic descriptions in file Topics.xls to enhance your model?

Requirement 4: Your complete report will eventually go to the DA so it must be detailed and clearly state your assumptions and methodology, but cannot exceed 20 pages of write up. You may include your programs as appendices in separate files that do not count in your page restriction, but including these programs is not necessary. Your supervisor wants ICM to be the world's best in solving white‐ collar, high‐tech conspiracy crimes and hopes your methodology will contribute to solving important cases around the world, especially those with very large databases of message traffic (thousands of people with tens of thousands of messages and possibly millions of words). She specifically asked you to include a discussion on how more thorough network, semantic, and text analyses of the message contents could help with your model and recommendations. As part of your report to her, explain the network modeling techniques you have used and why and how they can be used to identify, prioritize, and categorize similar nodes in a network database of any type, not just crime conspiracies and message data. For instance, could your method find the infected or diseased cells in a biological network where you had various kinds of image or chemical data for the nodes indicating infection probabilities and already identified some infected nodes?

*Your ICM submission should consist of a 1 page Summary Sheet and your solution cannot exceed 20 pages for a maximum of 21 pages.

Details

Revision #2
Created 4年前 by MathIF
Updated 4年前 by MathIF

Book Permissions Active

Attachments

Book Navigation

美国大学生数学建模竞赛

1985 MCM A：Animal Populations

1985 MCM B: Strategic Reserve Management

1986 MCM A: Hydrographic Data

1986 MCM B: The Emergency-Facilities Location

1987 MCM A: The Salt Storage Problem

1987 MCM B: Parking Lot Design

1988 MCM A: The Drug Runner Problem

1988 MCM B: The Railroad Flatcar Problem

1989 MCM A: The Midge Classification Problem

1989 MCM B: The Aircraft Queuing Problem

1990 MCM A: The Brain-Drug Problem

1990 MCM B: The Snowplow Problem

1991 MCM A：The Steiner Tree Problem

1991 MCM B：The Water Tank Problem

1992 MCM A：The Emergency PowerRestoration Problem

1992 MCM B：The Air-Traffic-Control Radar Problem

1993 MCM A：The Coal-Tipple Operations Problem

1993 MCM B：The Optimal Composting Problem

1994 MCM A：The Communications Network Problem

1994 MCM B：The Concrete Slab Problem

1995 MCM A： The Single Helix

1995 MCM B： Aluacha Balaclava College

1996 MCM A： The Submarine Detection Problem

1996 MCM B： The Contest Judging Problem

1997 MCM A： The Velociraptor Problem

1997 MCM B： Mix Well For Fruitful Discussions

1998 MCM A： MRI Scanners

1998 MCM B： Grade Inflation

1999 MCM A：Deep Impact

1999 MCM B： Unlawful Assembly

1999 ICM Problem： Ground Pollution

2000 MCM A： Air traffic Control

2000 MCM B：Radio Channel Assignments

2000 ICM Problem：Elephants: When is Enough, Enough?

2001 MCM A：Choosing a Bicycle Wheel

2001 MCM B: Escaping a Hurricane's Wrath (An Ill Wind...)

2001 ICM Problem：Our Waterways - An Uncertain Future

2002 MCM A：Wind and Waterspray

2002 MCM B：Airline Overbooking

2002 MCM Problem：If we SCRUB our land too much, we may lose the LIZARDs

2003 MCM A：The Stunt Person

2003 MCM B：Gamma Knife Treatment Planning

2003 ICM Problem：Aviation Baggage Screening Strategies: To Screen or Not to Screen, that is the Question

2004 MCM A：Are Fingerprints Unique?

2004 MCM B：A Faster QuickPass System

2004 ICM Problem：To Be Secure or Not to Be?

2005 MCM A：Flood Planning

2005 MCM B：Tollbooths

2005 ICM Problem：Nonrenewable Resources

2006 MCM A：Positioning and Moving Sprinkler Systems for Irrigation

2006 MCM B：Wheel Chair Access at Airports

2006 ICM Problem：Trade-offs in the fight against HIV/AIDS

2007 MCM A：Gerrymandering

2007 MCM B：The Airplane Seating Problem

2007 ICM PROBLEM：Organ Transplant: The Kidney Exchange Problem

2008 MCM A：Take a Bath

2008 MCM B：Creating Sudoku Puzzles

2008 ICM Problem：Finding the Good in Health Care Systems

2009 MCM A：Designing a Traffic Circle

2009 MCM B：Energy and the Cell Phone

2009 ICM Problem：Creating Food Systems: Re-Balancing Human-Influenced Ecosystems

2010 MCM A：The Sweet Spot

2010 MCM B：Criminology

2010 ICM Problem：The Great Pacific Ocean Garbage Patch

2011 MCM A：Snowboard Course

2011 MCM B：Repeater Coordination

2011 ICM Problem：How environmentally and economically sound are electric vehicles? Is their widespread use feasible and practical?

2012 MCM A：The Leaves of a Tree

2012 MCM B：Camping along the Big Long River

2012 ICM Problem：Modeling for Crime Busting

2013 MCM A：The Ultimate Brownie Pan

2013 MCM B：Water, Water, Everywhere

2013 ICM Problem A：Network Modeling of Earth's Health

2014 MCM A：The Keep-Right-Except-To-Pass Rule

2014 MCM B：College Coaching Legends

2014 ICM Problem： Using Networks to Measure Influence and Impact

2015 MCM A：Eradicating Ebola