COURSE OVERVIEW

 

Today, researchers across a wide variety of fields have to analyze an increasing amount of qualitative information. The objective of this summer school is therefore to give participants the toolkit they need to plan, conduct and statistically analyze qualitative text. To this end, the course provides an overview of three approaches to text analysis: qualitative analysis, quantitative content analysis and text mining. The opening sessions focus on the fundamental role of data preparation in the analysis, before moving on to identifying themes and correlations using both the text mining and the content analysis approaches. The final sessions address the more advanced topics of importing and exporting data, together with document classification.

TARGET AUDIENCE 

 

The summer school is aimed at:

 

Academic researchers, evaluators, policy advisers, social workers, educators and students working in economics, public health, sociology, psychology and political science;

 

Data mining and market research analysts working in the automotive, market research, logistics, transportation or telecommunications sectors who need to analyze comments from surveys, blogs, websites, social media platforms and other textual sources;

 

Insurance analysts needing to analyze and categorize claims from customers;

 

Researchers based in pharmaceutical companies and medical research laboratories who need to analyze healthcare reports, doctors' notes, and interviews and/or focus groups with patients.

 

PROGRAM


SETTING THE SCENE 

 

SESSION I: THREE APPROACHES TO TEXT ANALYSIS

 

Qualitative Analysis
Quantitative Content Analysis
Text Mining

 

SESSION II: QDA MINER AND WORDSTAT – A BRIEF OVERVIEW

 

QDA MINER

 

Introduction and project management
Codebook management and manual coding
Security features and text retrieval tools
Coding Frequency and Retrieval
Code co-occurrence and case similarity analysis
Assessing the relationship between coding and variables
Using the Report Manager and the Command Log
Performing teamwork
Miscellaneous Functions

 

WORDSTAT

 

Content Analysis or Text Mining
Analyzing words without dictionaries – a text mining approach
Content Analysis – Principles of dictionary construction
Importing and exporting data
Introduction to automatic document classification

QDA MINER

 

SESSION I: INTRODUCTION AND PROJECT MANAGEMENT

 

 

Introduction to CAQDAS using QDA Miner

 

The CASE x VARIABLE file structure
The Mixed-Method approach

 

Quick overview of the work environment

 

The four windows – CASE, VARIABLES, CODES, and DOCUMENT
The menu system

 

Creating a new project

 

Creating a new project from a list of documents
Creating a new project from an existing data file
Creating an empty project / defining structure
Using the document conversion wizard

 

Customizing and personalizing the project

 

The PROJECT | PROPERTIES dialog
The PROJECT | NOTES command

 

Manipulating variables

 

Adding a variable
Deleting a variable
Changing the variable data type
Recoding the values of a variable
Reordering variables
Changing variable properties

 

Manipulating cases

Adding a new case
Deleting cases
Importing new documents in new cases
Changing the case grouping and description

 

SESSION II: CODEBOOK MANAGEMENT AND MANUAL CODING

 

Creating codes and managing the codebook

 

Creating codes and categories
Modifying an existing code
Deleting existing codes
Moving codes in the codebook
Merging codes in the codebook
Splitting codes in the codebook
Importing an existing codebook

 

Manual coding of documents (versus autocoding)

 

The four basic methods for assigning codes to text segments:

 

Highlight text segment then drag a code
Highlight text segment then double-click a code
Highlight text segment then select a code and click the toolbar button
Drag and drop a code over a paragraph (or a sentence – press ALT)

 

Assignment of multiple codes to the same segment (press CTRL)

 

Modifying existing coding

 

Working with code marks
Viewing coding information
Adding a comment to a coding
Removing a coding
Changing the code assigned to a text segment
Resizing a segment
Consolidating codes
Searching and replacing codes
Hiding code marks
Highlighting coded segments

 

SESSION III: SECURITY FEATURES AND TEXT RETRIEVAL TOOLS

 

Using backup features

 

Creating a permanent backup
Restoring a backup
Using the temporary session backup

 

Text retrieval tools (4)

 

Searching for text
Performing a simple text search
Performing a complex text search (using Boolean and wildcard operators)
Performing a thesaurus search
Using the “search hits” table
Performing manual coding and autocoding
Saving to disk or printing the table

 

Retrieving sections in structured documents
Performing a query by example

 

Finding text similar to a sample text segment
Providing relevance feedback to improve search results
Finding text similar to specific coded segments
Performing “fuzzy string matching”

 

Performing a keyword search

 

Assigning keywords to codes
Performing a keyword retrieval on internal codes
Performing a keyword retrieval on WordStat dictionary files

 

SESSION IV: CODING FREQUENCY AND RETRIEVAL

 

Coding frequency

 

Creating a frequency list of all codes
Creating a bar chart or a pie chart on selected codes
Customizing the chart

 

Coding Retrieval

 

Performing a simple coding retrieval
Performing a complex search
Creating a text report
Creating a new project from
A shortcut for simple coding retrieval
Saving and retrieving queries
Retrieving a list of comments

 

SESSION V: CODE CO-OCCURRENCE AND CASE SIMILARITY ANALYSIS

 

Analyzing code co-occurrences

 

Hierarchical clustering of codes
2D and 3D multidimensional scaling plots
Using the Proximity plots
Assessing similarity of cases

 

Analyzing code sequences

 

Choosing codes and setting minimum / maximum distances
Using the Sequence matrix
Searching and coding specific sequences

 

SESSION VI: ASSESSING THE RELATIONSHIP BETWEEN CODING AND VARIABLES

 

Analyzing coding by variables

 

Crosstabulating coding frequency by variables
Setting the content and format of the table
Computing correlation or comparison statistics
Comparing frequencies using bar charts or line charts
Creating and interpreting 2D and 3D correspondence plots
Creating and interpreting heatmaps

 

A quick overview of graphic coding features

 

SESSION VII: USING THE REPORT MANAGER AND THE COMMAND LOG

 

Using the Report Manager

 

Accessing the Report Manager
The Report Manager interface
Appending tables, graphics and quotes
Moving and organizing items using the table of contents
Editing existing items / adding comments
Adding empty documents or folders and deleting existing items
Importing documents, images or tables
Searching and replacing text
Exporting results to HTML, Word or RTF files

 

Using the Command Log

 

Introduction to the command log – Filtering log entries
Adding comments to log entries
Undoing previously performed operations
Repeating previously performed operations
Exporting the log table to disk

 

SESSION VIII: PERFORMING TEAMWORK

 

Preparing projects for teamwork
Creating user accounts and setting privileges
Creating new accounts
Defining user access rights
Forcing users to log in
Creating duplicate copies of a project
Sending a project by email
Merging projects and assessing coding reliability
Merging two or more projects

 

Planning teamwork for assessing coding agreement
Adjusting colors of code marks
Computing coding agreement
The codebook and segmentation problems
Four levels of agreement
Presence or absence (0 or 1)
Frequency (0, 1, 2, etc.)
Coding importance (% of words)
Coding overlap
Correcting (or not) for chance agreement
Identifying disagreements
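
As an illustration of the "presence or absence" level and of correcting for chance agreement, the sketch below compares two coders' 0/1 decisions per case. It is not QDA Miner's own computation: Cohen's kappa is used here only as one common chance-corrected statistic, and the function names and sample data are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of cases on which both coders made the same decision."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohen_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    labels = set(coder_a) | set(coder_b)
    # Chance agreement: product of each coder's marginal proportions per label.
    expected = sum(
        (coder_a.count(label) / n) * (coder_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = code present in the case, 0 = code absent.
coder_a = [1, 1, 0, 0, 1, 0, 1, 0]
coder_b = [1, 0, 0, 0, 1, 0, 1, 1]

raw = percent_agreement(coder_a, coder_b)
kappa = cohen_kappa(coder_a, coder_b)
print(f"Raw agreement: {raw:.2f}, Cohen's kappa: {kappa:.2f}")
```

The gap between the raw and the corrected figure shows why the choice of whether to correct for chance matters when reporting coding reliability.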

 

WORDSTAT

 

SESSION IX: BASIC WORD STATISTICS AND TEXT MINING

 

Content Analysis or Text Mining
Running WordStat from QDA Miner or Simstat
Analyzing words without dictionaries – a text mining approach
Data preparation – misspelling and control characters
Basic word frequency analysis

 

Application of text pre-processing methods
Exclusion list – use with care
Lemmatization and stemming – limits and benefits
Setting upper and lower frequency criteria
A few additional options
Numeric and other non-alphabetic characters
Braces and square brackets
Random sampling
Using disk or memory as the working space
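
The pre-processing steps listed above can be pictured with a short sketch of word counting with an exclusion list and frequency criteria. This is a plain Python illustration under simplifying assumptions (lowercase alphabetic tokens only, no lemmatization or stemming), not WordStat's implementation; the function name, parameters and sample documents are hypothetical.

```python
import re
from collections import Counter


def word_frequencies(documents, exclusion_list=(), min_count=1, max_count=None):
    """Count words across documents, then apply exclusion and frequency criteria."""
    excluded = {word.lower() for word in exclusion_list}
    counts = Counter()
    for text in documents:
        # Keep alphabetic tokens only; digits and punctuation are dropped.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(token for token in tokens if token not in excluded)
    # Apply the lower and (optional) upper frequency criteria.
    return {
        word: count
        for word, count in counts.items()
        if count >= min_count and (max_count is None or count <= max_count)
    }


# Hypothetical sample documents.
docs = [
    "The claim was denied.",
    "The customer filed a claim.",
    "A new claim form",
]
freqs = word_frequencies(docs, exclusion_list=["the", "a", "was"], min_count=2)
print(freqs)
```

Raising `min_count` removes rare words, while `max_count` can drop words so frequent that they carry little information, which is one reason an exclusion list should be used with care.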

 

Identifying themes using word co-occurrence analysis

 

Clustering words and measuring their proximity
Clustering documents based on the words they contain

 

Correlation and comparison analysis based on word usage

 

Performing crosstabs and computing statistics
Comparing words among the sources (document or text variables)
Correspondence analysis and heatmaps

 

SESSION X: CONTENT ANALYSIS – PRINCIPLES OF DICTIONARY CONSTRUCTION

 

Introduction to the WordStat categorization dictionary

 

Dictionary structure and functions
Opening, saving, and creating categorization dictionaries
Manually creating categories of words and phrases
Principles of dictionary construction – Extracting features
Identification of technical terms and proper names (persons, places, products)
Identification of common misspellings
Extracting phrases
Creating an initial dictionary – phrases, technical terms, proper nouns and words
Adding words manually
Adding words from tables
Using the drag and drop editor
Organizing the dictionary (drag and drop)

 

Applying the dictionary

 

Setting different levels
Mixing dictionaries with words

 

Validating the dictionary

 

Finding words or phrases with improper meanings using the KWIC list
WordStat evaluation order – how to use this to your advantage
Disambiguation methods
Manual disambiguation
Disambiguation using phrases
Disambiguation using rules

 

Improving categorization dictionaries

 

Creating comprehensive dictionaries using the Suggest button
Assessing coverage using the keyword retrieval feature

 

SESSION XI: ADVANCED FEATURES

 

Importing and exporting data
Exporting frequency data