
Statistical Significance Testing In Information Retrieval



Posted: Thu 6 Jul 2017, 14:42    Subject: Statistical Significance Testing In Information Retrieval

The past 20 years have seen a great improvement in the rigor of information retrieval experimentation, due primarily to two factors: high-quality, public, portable test collections such as those produced by TREC (the Text REtrieval Conference), and the increased practice of statistical hypothesis testing to determine whether measured improvements can be ascribed to something other than random chance. Together these create a very useful standard for reviewers, program committees, and journal editors; work in information retrieval (IR) increasingly cannot be published unless it has been evaluated using a well-constructed test collection and shown to produce a statistically significant improvement over a good baseline. But, as the saying goes, any tool sharp enough to be useful is also sharp enough to be dangerous. Statistical tests of significance are widely misunderstood. Most researchers and developers treat them as a "black box": evaluation results go in and a p-value comes out. But because significance is such an important factor in determining what research directions to explore and what is published, using p-values obtained without thought can have consequences for everyone doing research in IR. Ioannidis has argued that the main consequence in the biomedical sciences is that most published research findings are false; could that be the case in IR as well? Our goal with this work is to help researchers and developers gain a better understanding of how tests work and how they should be interpreted, so that they can both use them more effectively in their day-to-day work and better interpret them when reading the work of others. We will do this primarily with three tools: (a) mathematical analysis; (b) simulation; and (c) experimentation with TREC data. Because of the availability of TREC data, IR as a field is uniquely positioned to evaluate significance testing in the presence of a wide variety of "failed" experiments.
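To make the "black box" described above a little less opaque, here is a minimal sketch of one common kind of test, a paired randomization (sign-flip) test on per-topic scores. It is not taken from the work summarized above: the function name, the choice of test, and the score values are all assumptions made up for illustration, and a real comparison would use per-topic scores from an actual test collection.

```python
import random


def paired_randomization_test(baseline, system, trials=10000, seed=0):
    """Estimate a two-sided p-value for the mean per-topic difference.

    Hypothetical helper for illustration; not from the original work.
    """
    if len(baseline) != len(system):
        raise ValueError("score lists must cover the same topics")
    rng = random.Random(seed)
    diffs = [s - b for b, s in zip(baseline, system)]
    observed = abs(sum(diffs) / len(diffs))
    extreme = 0
    for _ in range(trials):
        # Under the null hypothesis the two systems are interchangeable on
        # each topic, so every difference is equally likely to have either sign.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed:
            extreme += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (extreme + 1) / (trials + 1)


# Made-up per-topic average precision values (assumption, not real TREC data).
baseline_ap = [0.21, 0.35, 0.10, 0.48, 0.27, 0.33, 0.15, 0.40, 0.22, 0.30]
system_ap = [0.25, 0.37, 0.12, 0.47, 0.33, 0.36, 0.19, 0.45, 0.21, 0.34]

p_value = paired_randomization_test(baseline_ap, system_ap)
print(f"estimated p-value: {p_value:.4f}")
```

The returned value estimates how often a mean difference at least this large would arise if the system labels were assigned at random within each topic. It is not the probability that the improvement is real, which is one of the misreadings the post warns about.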


