Universities.sg | Methodology
Methodology & Data Sources
Data Sources
Indicative Grade Profiles (IGP) - A-level rank point cutoffs come directly from university admissions offices. We collect JSON data files published by NUS, NTU, and SMU for the 2024/2025 academic year. These show the 10th and 90th percentile scores of successful applicants from the previous admission exercise. The data represents actual admitted students, not minimum requirements. Source files: nus_alevel_grade_profile.json, ntu_alevel_grade_profile.json, smu_alevel_grade_profile.json in our raw data directory.
Graduate Employment Survey (GES) - Employment and salary data comes from SkillsFuture Singapore's annual Graduate Employment Survey. We process the consolidated CSV file containing results from 2013 to 2025, covering all six autonomous universities. The survey is conducted six months after graduation and includes employment rates (overall, full-time permanent) and salary statistics (median, mean, 25th/75th percentiles). The data excludes graduates in voluntary unemployment and those pursuing further studies. Official source: GraduateEmploymentSurveyNTUNUSSITSMUSUSSSUTD.csv, published at data.gov.sg.
Polytechnic GPA Cutoffs - GPA requirements for polytechnic diploma holders come from the same university sources as IGP data. These show the 10th and 90th percentile GPAs (on a 4.0 scale) of successful polytechnic applicants. Not all courses accept polytechnic students, and some have additional subject prerequisites. Source files: nus_poly_gpa_profile.json, ntu_poly_gpa_profile.json, smu_poly_gpa_profile.json.
Enrollment Statistics - Student enrollment numbers by programme come from university-specific JSON files for NUS, NTU, and SMU. This data helps identify programme size and availability but isn't displayed directly on the site. It's used internally for data validation and to identify discontinued programmes. Source files: nus-undergraduate-enrolment.json, ntu-undergraduate-enrolment.json, smu-undergraduate-enrolment.json.
Graduate Statistics - Historical graduate counts by university and degree type come from Ministry of Education's annual statistics, specifically the CSV file on graduates from university first degree courses. This provides context about cohort sizes over time. Source: GraduatesFromUniversityFirstDegreeCoursesByTypeOfCourseAndSexAnnual.csv from data.gov.sg.
Data Processing Pipeline
The raw data goes through a five-stage processing pipeline implemented in TypeScript. Each stage transforms the data progressively, resolving inconsistencies and linking the different datasets. The pipeline runs via the command `pnpm process-data` and takes approximately 3 seconds to complete.
Stage 1: Load - Reads all raw data files from app/data/raw/. This includes parsing JSON files for IGP, GPA, and enrollment data, and CSV files for GES and graduate statistics. Basic validation checks that all expected fields are present. Malformed records are logged but don't stop the pipeline.
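The "log but don't stop" behaviour can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual loader: the record shape (`programme`, `p10`, `p90`) and the function names are assumptions, and the real raw files may use different field names.

```typescript
// Hypothetical record shape; the real field names in the raw files may differ.
interface RawIgpRecord {
  programme: string;
  p10: string;
  p90: string;
}

interface LoadResult<T> {
  records: T[];
  warnings: string[];
}

const REQUIRED_FIELDS = ["programme", "p10", "p90"] as const;

// Type guard: true only when every required field is present as a string.
function isRawIgpRecord(value: unknown): value is RawIgpRecord {
  if (typeof value !== "object" || value === null) return false;
  const obj = value as Record<string, unknown>;
  return REQUIRED_FIELDS.every((field) => typeof obj[field] === "string");
}

// Keeps valid records and collects a warning for each malformed one,
// so a single bad row never aborts the run.
// Note: JSON.parse still throws if the file as a whole is not valid JSON.
function loadIgpRecords(jsonText: string, sourceName: string): LoadResult<RawIgpRecord> {
  const parsed: unknown = JSON.parse(jsonText);
  const rows: unknown[] = Array.isArray(parsed) ? parsed : [];
  const result: LoadResult<RawIgpRecord> = { records: [], warnings: [] };
  rows.forEach((row, index) => {
    if (isRawIgpRecord(row)) {
      result.records.push(row);
    } else {
      result.warnings.push(sourceName + ": record " + index + " is missing a required field");
    }
  });
  return result;
}
```

Returning the warnings alongside the records, rather than throwing, is one way to feed a report file at the end of the run.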
Stage 2: Create Courses - Generates canonical course objects from A-level IGP data. Each course gets a unique slug (e.g., "nus-law", "ntu-engineering-mechanical") based on university and programme name. The system normalizes course names, removing special characters and standardizing formats. For courses using the legacy 90-point system in their IGP strings (like "AAA/A"), we parse the grades and convert to the 70-point scale.
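As a rough illustration of slug generation, the sketch below lower-cases the university and programme name and collapses every run of non-alphanumeric characters into a single hyphen. The function name `slugify` and the input "Engineering (Mechanical)" are assumptions for the example; the actual normalization rules may be more involved.

```typescript
// Builds a URL-safe slug from a university code and a programme name.
// Hypothetical helper: the real pipeline may apply additional rules.
function slugify(university: string, programme: string): string {
  return (university + " " + programme)
    .toLowerCase()
    .replace(/&/g, " and ")        // spell out ampersands before stripping symbols
    .replace(/[^a-z0-9]+/g, "-")   // collapse special characters and spaces into one hyphen
    .replace(/^-+|-+$/g, "");      // trim leading and trailing hyphens
}

console.log(slugify("NUS", "Law"));                      // nus-law
console.log(slugify("NTU", "Engineering (Mechanical)")); // ntu-engineering-mechanical
```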
Stage 3: Merge Additional Data - Combines polytechnic GPA cutoffs and intake numbers with the canonical courses. Matching is done by programme name after normalization. This stage also identifies courses that accept polytechnic students versus those that are A-level only. About 60% of courses have polytechnic entry routes.
Stage 4: Process GES Data - The most complex stage, which matches employment survey results to courses. This is challenging because GES uses full degree names (e.g., "Bachelor of Business Administration") while IGP uses programme names (e.g., "Business"). The system tries three matching strategies in order: first, manual overrides defined in ges-overrides.ts for 54+ known mismatches; second, exact matching after normalization; third, fuzzy matching using Levenshtein distance for close matches. The pipeline also detects variants such as "direct" vs "non-direct" honours programmes and stores them separately. It currently matches about 70% of courses to GES data.
Stage 5: Generate Outputs - Creates five JSON data files in app/data/processed/. The main file, courses.json, contains all 284 courses keyed by slug. The file time-series.json holds historical GES data from 2013 to 2025 for trend analysis. The file search-index.json is optimized for autocomplete and contains only names and IDs. The file faculty-structure.json organizes courses hierarchically by university and faculty. The file course-lists.json contains pre-sorted arrays for common queries (by COP, salary, employment rate). In addition, processing-report.json documents any warnings or errors encountered.
90-Point to 70-Point Conversion
Singapore changed its A-level scoring system in 2024, moving from 90 points maximum to 70 points. Since historical IGP data uses the old system, we need to convert it for fair comparison with current scores. The conversion happens in lib/grade-conversion.ts, which parses grade strings and recalculates points.
For a grade string like "AAA/A" (three H2 subjects plus one H1), we parse each grade and assign points: A=20, B=17.5, C=15, D=12.5, E=10 for H2 subjects, with H1 subjects worth half. We assume a GP (General Paper) grade of C for all historical data, as GP scores weren't included in published IGP. The total is calculated as: (3 H2 subjects + 1 H1 subject + GP) out of a possible 80 points, then scaled to 70: (score / 80) × 70. For example, "AAA/A" becomes (60 + 10 + 7.5) / 80 × 70 = 67.8 rank points.
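The arithmetic above can be sketched as a small function. This is an illustrative reconstruction from the description, not the contents of lib/grade-conversion.ts: the names `convertLegacyIgp` and `pointsFor` are assumptions, and the sketch only handles the "three H2 / one H1" string format shown in the example.

```typescript
// Points per H2 grade, as listed above; H1 subjects are worth half.
const H2_POINTS: Record<string, number> = { A: 20, B: 17.5, C: 15, D: 12.5, E: 10 };

// Assumption stated in the text: GP is taken as grade C for all historical data.
const ASSUMED_GP_GRADE = "C";

function pointsFor(grade: string, level: "H1" | "H2"): number {
  const h2 = H2_POINTS[grade];
  if (h2 === undefined) throw new Error("Unknown grade: " + grade);
  return level === "H2" ? h2 : h2 / 2;
}

// Converts a legacy string such as "AAA/A" to the 70-point scale,
// rounded to one decimal place.
function convertLegacyIgp(gradeString: string): number {
  const parts = gradeString.trim().toUpperCase().split("/");
  const h2Grades = parts[0] ?? "";
  const h1Grade = parts[1] ?? "";
  if (parts.length !== 2 || h2Grades.length !== 3 || h1Grade.length !== 1) {
    throw new Error("Expected a string like AAA/A, got: " + gradeString);
  }
  let total = 0;
  h2Grades.split("").forEach((grade) => {
    total += pointsFor(grade, "H2");          // three H2 subjects, max 60
  });
  total += pointsFor(h1Grade, "H1");          // one H1 subject, max 10
  total += pointsFor(ASSUMED_GP_GRADE, "H1"); // assumed GP grade C = 7.5
  const scaled = (total / 80) * 70;           // rescale from 80 to 70
  return Math.round(scaled * 10) / 10;
}

console.log(convertLegacyIgp("AAA/A")); // 67.8
console.log(convertLegacyIgp("BBB/B")); // 60.2
```

Because GP is fixed at C, the highest value this conversion can produce is 67.8, which is worth keeping in mind when comparing converted figures against current 70-point cutoffs.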
This conversion involves assumptions and isn't perfect. The actual GP grades of admitted students likely varied, and the old system included Project Work while the new one doesn't. However, it provides a reasonable approximation for comparing historical cutoffs with current requirements. Users should treat converted values as estimates, especially for borderline cases.
GES Matching Process
Matching Graduate Employment Survey data to specific courses is complex because the survey uses different naming conventions than university admissions data. For example, NUS Law appears as "Bachelor of Laws" in GES but simply "Law" in IGP. SMU's "Business Management" in admissions becomes "Bachelor of Business Management" in the survey. These mismatches would leave most courses without employment data if not handled carefully.
The system uses a three-tier matching approach implemented in stage4-ges.ts. First priority goes to manual overrides defined in data/overrides/ges-overrides.ts. These handle known problem cases like Medicine, double degree programmes, and courses with year-to-year naming changes. The override system supports different match modes: 'contains' for partial matches, 'prefix' for start-of-string matches, and 'exact' for precise matching after normalization. It can also exclude certain variants (like "cum laude" programmes) and aggregate multiple GES records for umbrella programmes.
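A minimal sketch of how the three match modes and variant exclusion might fit together is shown below. The `GesOverride` shape, its field names, and the example slugs are assumptions for illustration; the actual structure of ges-overrides.ts is not reproduced here, and aggregation of multiple GES records is left out.

```typescript
type MatchMode = "contains" | "prefix" | "exact";

// Hypothetical override shape; field names are illustrative only.
interface GesOverride {
  courseSlug: string;
  pattern: string;
  mode: MatchMode;
  exclude?: string[]; // variants to skip, e.g. "cum laude"
}

// Simplified normalization: lower-case and reduce punctuation to single spaces.
function normalizeDegree(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Returns true when a GES degree name satisfies the override's rule.
function overrideMatches(override: GesOverride, gesDegree: string): boolean {
  const degree = normalizeDegree(gesDegree);
  const pattern = normalizeDegree(override.pattern);
  const excluded = (override.exclude ?? []).some(
    (term) => degree.indexOf(normalizeDegree(term)) !== -1
  );
  if (excluded) return false;
  switch (override.mode) {
    case "contains":
      return degree.indexOf(pattern) !== -1;
    case "prefix":
      return degree.indexOf(pattern) === 0;
    case "exact":
      return degree === pattern;
  }
}

const businessOverride: GesOverride = {
  courseSlug: "smu-business-management", // hypothetical slug
  pattern: "Business Management",
  mode: "contains",
  exclude: ["cum laude"],
};

console.log(overrideMatches(businessOverride, "Bachelor of Business Management")); // true
console.log(overrideMatches(businessOverride, "Bachelor of Business Management (Cum Laude and above)")); // false
```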
When no override exists, the system attempts exact matching after normalizing both course and GES degree names. Normalization removes articles, standardizes spacing, and handles common abbreviations. If exact matching fails, fuzzy matching using Levenshtein distance finds close matches within a threshold of 5 character edits. All matches are tagged with confidence levels (high, medium, low) and the method used (override, exact, fuzzy) for transparency.
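The exact-then-fuzzy fallback can be sketched as follows, using a standard dynamic-programming Levenshtein distance and the threshold of 5 edits mentioned above. The names `normalizeName` and `matchCourse`, the simplified normalization (abbreviation handling is omitted), and the mapping from edit distance to confidence level are all assumptions, not the logic in stage4-ges.ts.

```typescript
// Standard Levenshtein distance using two rolling rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Simplified normalization: lower-case, drop articles, collapse whitespace.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/\b(a|an|the)\b/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

interface MatchResult {
  gesDegree: string;
  method: "exact" | "fuzzy";
  confidence: "high" | "medium" | "low";
  distance: number;
}

const FUZZY_THRESHOLD = 5; // maximum character edits, per the text above

// Tries an exact match first, then the closest fuzzy match within the threshold.
function matchCourse(courseName: string, gesDegrees: string[]): MatchResult | null {
  const target = normalizeName(courseName);
  let best: MatchResult | null = null;
  for (let i = 0; i < gesDegrees.length; i++) {
    const gesDegree = gesDegrees[i];
    const distance = levenshtein(target, normalizeName(gesDegree));
    if (distance === 0) {
      return { gesDegree, method: "exact", confidence: "high", distance };
    }
    if (distance <= FUZZY_THRESHOLD && (best === null || distance < best.distance)) {
      // Assumed mapping: very close matches are "medium", the rest "low".
      best = { gesDegree, method: "fuzzy", confidence: distance <= 2 ? "medium" : "low", distance };
    }
  }
  return best;
}
```

A fixed threshold of 5 edits is generous for short names, so a length-relative threshold may be worth considering; the sketch keeps the absolute value described above.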
Data Freshness
The site uses the latest available data from official sources. Admissions data (IGP and polytechnic GPA) reflects the 2024/2025 academic year intake, published by universities in mid-2024. Graduate Employment Survey data includes results through the 2025 survey (conducted in 2024 for 2024 graduates), published in February 2025. This means employment statistics are about 6 months behind graduation, while admissions data is about 1 year behind the current application cycle.
Updates happen annually when new data becomes available. GES results typically release in February/March, while university admissions data publishes around July/August after each intake. We process new data within weeks of official release. All processed data is committed to the Git repository rather than fetched at build time. That keeps data consistent across deployments and lets us track changes over time.
Known Limitations
GES Coverage Gaps - Not all courses appear in the Graduate Employment Survey. Some programmes are too new; others have cohorts too small for statistical reliability. Currently about 70% of courses have GES data. Courses without employment data show "No GES data available" rather than estimates. Professional programmes like Medicine and Law may have different survey timings due to longer training periods.
Suppressed Data - The government suppresses statistics when cohort sizes are small (typically under 30 graduates) to protect privacy. This affects specialized programmes and newer courses. Suppressed data appears as null values in our dataset. We don't interpolate or estimate these missing values.
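One way to keep suppressed values explicit when building sorted lists is sketched below: salaries are typed as `number | null`, and null entries are placed last rather than being replaced with an estimate. The `CourseSalary` shape, the function name, and the sample figures are assumptions for illustration only.

```typescript
// Hypothetical shape; a null median means the figure was suppressed or unavailable.
interface CourseSalary {
  slug: string;
  medianSalary: number | null;
}

// Sorts by median salary, highest first, and keeps null entries at the end.
// Nulls are never imputed; they are only ordered after real values.
function sortBySalaryDesc(courses: CourseSalary[]): CourseSalary[] {
  return courses.slice().sort((a, b) => {
    if (a.medianSalary === null && b.medianSalary === null) return 0;
    if (a.medianSalary === null) return 1;
    if (b.medianSalary === null) return -1;
    return b.medianSalary - a.medianSalary;
  });
}

// Sample values are made up for the example.
const sample: CourseSalary[] = [
  { slug: "course-a", medianSalary: 4000 },
  { slug: "course-b", medianSalary: null },
  { slug: "course-c", medianSalary: 5000 },
];

console.log(sortBySalaryDesc(sample).map((c) => c.slug)); // [ 'course-c', 'course-a', 'course-b' ]
```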
Historical Conversion Assumptions - Converting old 90-point IGP to the current 70-point system requires assumptions about GP grades and the role of Project Work. The conversion formula assumes all students had GP grade C, which may overstate or understate actual requirements. Users should treat historical IGP as approximate, especially for competitive courses where small differences matter.
University Coverage Variance - NUS, NTU, and SMU have comprehensive data across all categories. SIT, SUSS, and SUTD have limited or no IGP data as they use different admission criteria or are newer institutions. These universities appear in GES data but may lack admissions statistics. This reflects data availability from official sources, not selective coverage on our part.
Feedback and Corrections
Data accuracy depends on both our processing and the quality of source data. If you notice errors - whether incorrect IGP scores, mismatched courses, or outdated information - please report them via our feedback form. Include the specific course and data point in question. We investigate all reports and apply corrections in the next update cycle.
Related Pages
About Universities.sg - Our mission and what we offer
Rank Point Calculator - Calculate your score and see matching courses