
Speech-to-SQL (S2SQL) aims to convert spoken questions into SQL queries given relational databases. It has traditionally been implemented in a cascaded manner and faces the following challenges: 1) model training suffers from data scarcity, as limited parallel data is available; and 2) the systems should be robust to out-of-domain speech samples that differ from the source data. In this work, we propose Wav2SQL, the first direct speech-to-SQL parsing model, which avoids error compounding across cascaded systems. Specifically, 1) to accelerate speech-driven SQL parsing research in the community, we release MASpider, a large-scale, multi-speaker dataset; 2) leveraging recent progress in large-scale pre-training, we show that it alleviates the data scarcity issue and allows for direct speech-to-SQL parsing; and 3) we include speech re-programming and gradient reversal classifier techniques to reduce acoustic variance and learn style-agnostic representations, improving generalization to unseen out-of-domain custom data. Experimental results demonstrate that Wav2SQL avoids error compounding and achieves state-of-the-art results, with an accuracy improvement of up to 4.6% over the baseline.

Audio samples are available at https://Wav2SQL.github.io/.
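
The gradient reversal idea mentioned in the abstract is commonly realized as a layer that passes features through unchanged in the forward pass and flips the sign of the gradient in the backward pass, so that the encoder is pushed away from features a style classifier can exploit. The following is a minimal sketch of that general idea on a scalar toy model, not the Wav2SQL implementation; the function name `adversarial_grads`, the parameters `w`, `v`, `scale`, and the squared loss are all assumptions chosen for illustration.

```python
def adversarial_grads(w, v, x, y, scale=1.0):
    """Toy gradient reversal: one-parameter encoder (w) and classifier (v).

    All names and the squared loss are hypothetical, for illustration only.
    """
    h = w * x                        # encoder feature
    pred = v * h                     # classifier sees h unchanged (identity forward)
    dloss_dpred = 2.0 * (pred - y)   # derivative of (pred - y) ** 2

    grad_v = dloss_dpred * h         # classifier gradient is left untouched
    grad_h = dloss_dpred * v         # gradient arriving at the reversal layer
    grad_w = (-scale * grad_h) * x   # sign flipped before reaching the encoder
    return grad_v, grad_w


grad_v, grad_w = adversarial_grads(w=0.5, v=2.0, x=1.0, y=0.0)
print(grad_v, grad_w)
```

With these inputs the classifier gradient keeps its usual sign while the encoder gradient is negated, which is what makes the two parts train adversarially.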


Wav2SQL: Direct Generalizable Speech-To-SQL Parsing

Britain Accent


| Spoken question | Text question | Target SQL |
| --- | --- | --- |
| Male Sample 1 | What are the names of projects that require more than 300 hours, and how many scientists are assigned to each? | `SELECT count(*), T1.name FROM projects AS T1 JOIN assignedto AS T2 ON T1.code = T2.project WHERE T1.hours > 300 GROUP BY T1.name` |
| Male Sample 2 | Find the number of projects which each scientist is working on and scientist's name. | `SELECT count(*), T1.name FROM scientists AS T1 JOIN assignedto AS T2 ON T1.ssn = T2.scientist GROUP BY T1.name` |
| Female Sample 1 | Find the emails of customers who has filed a complaints of the product with the most complaints. | `SELECT t1.email_address FROM customers AS t1 JOIN complaints AS t2 ON t1.customer_id = t2.customer_id GROUP BY t1.customer_id ORDER BY count(*) LIMIT 1` |
| Female Sample 2 | Which products has been complained by the customer who has filed least amount of complaints? | `SELECT DISTINCT t1.product_name FROM products AS t1 JOIN complaints AS t2 ON t1.product_id = t2.product_id JOIN customers AS t3 GROUP BY t3.customer_id ORDER BY count(*) LIMIT 1` |


Japan Accent


| Spoken question | Text question | Target SQL |
| --- | --- | --- |
| Male Sample 1 | What are the carriers of devices whose software platforms are not "Android"? | `SELECT Carrier FROM device WHERE Software_Platform != 'Android'` |
| Male Sample 2 | What are the names of shops in ascending order of open year? | `SELECT Shop_Name FROM shop ORDER BY Open_Year ASC` |
| Female Sample 1 | How many friends does Dan have? | `SELECT count(T2.friend) FROM Person AS T1 JOIN PersonFriend AS T2 ON T1.name = T2.name WHERE T1.name = 'Dan'` |
| Female Sample 2 | How many females does this network has? | `SELECT count(*) FROM Person WHERE gender = 'female'` |


Thailand Accent


| Spoken question | Text question | Target SQL |
| --- | --- | --- |
| Male Sample 1 | What is the forename and surname of the driver with the shortest laptime? | `SELECT T1.forename , T1.surname FROM drivers AS T1 JOIN laptimes AS T2 ON T1.driverid = T2.driverid ORDER BY T2.milliseconds LIMIT 1` |
| Male Sample 2 | What is the id and family name of the driver who has the longest laptime? | `SELECT T1.driverid , T1.surname FROM drivers AS T1 JOIN laptimes AS T2 ON T1.driverid = T2.driverid ORDER BY T2.milliseconds DESC LIMIT 1` |
| Female Sample 1 | Find the buildings which have rooms with capacity more than 50. | `SELECT DISTINCT building FROM classroom WHERE capacity > 50` |
| Female Sample 2 | Count the number of rooms that are not in the Lamberton building. | `SELECT count(*) FROM classroom WHERE building != 'Lamberton'` |


China Accent


| Spoken question | Text question | Target SQL |
| --- | --- | --- |
| Male Sample 1 | how long is rio grande | `SELECT LENGTH FROM river WHERE river_name = "rio grande"` |
| Male Sample 2 | how long is the longest river in texas | `SELECT LENGTH FROM river WHERE LENGTH = ( SELECT MAX ( LENGTH ) FROM river WHERE traverse = "texas" ) AND traverse = "texas"` |
| Female Sample 1 | How many papers did michael i. jordan publish in 2016 ? | `SELECT DISTINCT COUNT ( t2.paperid ) FROM writes AS t2 JOIN author AS t1 ON t2.authorid = t1.authorid JOIN paper AS t3 ON t2.paperid = t3.paperid WHERE t1.authorname = "michael i. jordan" AND t3.year = 2016` |
| Female Sample 2 | How many papers did michael i. jordan publish in 2016 ? | `SELECT DISTINCT COUNT ( t2.paperid ) FROM writes AS t2 JOIN author AS t1 ON t2.authorid = t1.authorid JOIN paper AS t3 ON t2.paperid = t3.paperid WHERE t1.authorname = "michael i. jordan" AND t3.year = 2016` |


America Accent


| Spoken question | Text question | Target SQL |
| --- | --- | --- |
| Male Sample 1 | How many heads of the departments are older than 56 ? | `SELECT count(*) FROM head WHERE age > 56` |
| Male Sample 2 | List the name, born state and age of the heads of departments ordered by age. | `SELECT name, born_state, age FROM head ORDER BY age` |
| Female Sample 1 | Give me the dates when the max temperature was higher than 85. | `SELECT date FROM weather WHERE max_temperature_f > 85` |
| Female Sample 2 | What are the names of stations that have latitude lower than 37.5? | `SELECT name FROM station WHERE lat < 37.5` |
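
To make concrete what a target SQL query returns, the two `head` queries from the samples above can be run against a small in-memory SQLite database. This is only an illustrative sketch: the column types and the three rows are made-up assumptions, not data from MASpider, and the helper `run_query` is hypothetical.

```python
import sqlite3


def run_query(conn, sql):
    """Execute a SQL string and return all result rows (hypothetical helper)."""
    return conn.execute(sql).fetchall()


# Toy schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE head (name TEXT, born_state TEXT, age REAL)")
conn.executemany(
    "INSERT INTO head VALUES (?, ?, ?)",
    [
        ("A", "California", 67.0),
        ("B", "Alabama", 52.0),
        ("C", "Florida", 69.0),
    ],
)

# Target SQL strings copied from the two Male samples above.
count_rows = run_query(conn, "SELECT count(*) FROM head WHERE age > 56")
ordered_rows = run_query(conn, "SELECT name, born_state, age FROM head ORDER BY age")

print(count_rows)    # [(2,)]
print(ordered_rows)  # rows sorted by ascending age, starting with ("B", "Alabama", 52.0)
```

Executing a predicted query and a target query on the same database and comparing the returned rows is one simple way to check whether they agree on that database; whether this matches the accuracy metric reported for Wav2SQL is not stated on this page.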


Korea Accent


| Spoken question | Text question | Target SQL |
| --- | --- | --- |
| Female Sample 1 | How many tracks do we have? | `SELECT count(*) FROM track` |
| Female Sample 2 | Show the name and location for all tracks. | `SELECT name, LOCATION FROM track` |