Comments (7)
Possible problems on SET statements while importing data, further investigation needed.
Relevant:
LOAD DATA interprets an empty field value differently from a missing field:
- For string types, the column is set to the empty string.
- For numeric types, the column is set to 0.
- For date and time types, the column is set to the appropriate “zero” value for the type. See Section 11.2, “Date and Time Data Types”.
https://dev.mysql.com/doc/refman/8.0/en/load-data.html
https://dev.mysql.com/doc/refman/8.0/en/datetime.html
from gtfs-bench.
This is the origin of the difference:
Lines 280 to 291 in 205ce1a
But I can't see how the 0 become NULL; it should be 0 and only NULL if exact_times
is an empty value... maybe some problem importing the CSV file to MySQL? Needs further investigation.
from gtfs-bench.
The problem is in the input dataset, the value of exact_time is always NULL, bc it's optional in the GTFS model and our dataset does not provide that information. What I don't understand is why the output contains 0...
from gtfs-bench.
The problem is in the input dataset, the value of exact_time is always NULL, bc it's optional in the GTFS model and our dataset does not provide that information. What I don't understand is why the output contains 0...
The data that VIG uses have NULL values:
DROP TABLE IF EXISTS `FREQUENCIES`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `FREQUENCIES` (
`trip_id` varchar(200) NOT NULL,
`start_time` varchar(200) NOT NULL,
`end_time` varchar(200) DEFAULT NULL,
`headway_secs` int(11) DEFAULT NULL,
`exact_times` int(11) DEFAULT '0',
PRIMARY KEY (`trip_id`,`start_time`),
CONSTRAINT `FREQUENCIES_ibfk_1` FOREIGN KEY (`trip_id`) REFERENCES `TRIPS` (`trip_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
/*!40101 SET character_set_client = @saved_cs_client */;
--
-- Dumping data for table `FREQUENCIES`
--
LOCK TABLES `FREQUENCIES` WRITE;
/*!40000 ALTER TABLE `FREQUENCIES` DISABLE KEYS */;
INSERT INTO `FREQUENCIES` VALUES ('4_I12-001_2016I12_1_1_4__1___','00:00:00','01:00:00',600,NULL),('4_I12-001_2016I12_1_1_4__1___','01:00:00','02:00:00',900,NULL),('4_I12-001_2016I12_1_1_4__1___','06:05:00','07:00:00',540,NULL)
[...]
And confirmed:
mysql> select * from FREQUENCIES LIMIT 10;
+-------------------------------+------------+----------+--------------+-------------+
| trip_id | start_time | end_time | headway_secs | exact_times |
+-------------------------------+------------+----------+--------------+-------------+
| 4_I12-001_2016I12_1_1_4__1___ | 00:00:00 | 01:00:00 | 600 | NULL |
| 4_I12-001_2016I12_1_1_4__1___ | 01:00:00 | 02:00:00 | 900 | NULL |
| 4_I12-001_2016I12_1_1_4__1___ | 06:05:00 | 07:00:00 | 540 | NULL |
| 4_I12-001_2016I12_1_1_4__1___ | 07:00:00 | 09:00:00 | 480 | NULL |
| 4_I12-001_2016I12_1_1_4__1___ | 09:00:00 | 18:00:00 | 480 | NULL |
| 4_I12-001_2016I12_1_1_4__1___ | 18:00:00 | 21:30:00 | 360 | NULL |
| 4_I12-001_2016I12_1_1_4__1___ | 21:30:00 | 23:00:00 | 390 | NULL |
| 4_I12-001_2016I12_1_1_4__1___ | 23:00:00 | 24:00:00 | 450 | NULL |
| 4_I12-001_2016I12_2_1_4__1___ | 00:00:00 | 01:00:00 | 600 | NULL |
| 4_I12-001_2016I12_2_1_4__1___ | 01:00:00 | 02:00:00 | 900 | NULL |
+-------------------------------+------------+----------+--------------+-------------+
10 rows in set (0.00 sec)
Maybe a VIG bug?
from gtfs-bench.
I'll make some tests locally
from gtfs-bench.
seems that the problem is here: SET exact_times = IF(exact_times='',NULL,exact_times);
from gtfs-bench.
if there are other issues regarding set in the loading but is not in exact_times (as it has been solved) please @daniel-dona open other issues
from gtfs-bench.
Related Issues (20)
- MySQL "LOCAL INFILE" import HOT 3
- Docker "--pull always" option HOT 3
- Improve output data compression HOT 8
- Mysql 8.0: Incorrect DATE value
- Fix shape_dist / shape_dist_traveled inconsistency
- url fixed columns are not mantained in the scaling-up with VIG HOT 2
- Include fixed jar from VIG HOT 1
- shape_dist_traveled not found in CSV HOT 3
- Table names in mysql mappings wrong HOT 3
- Mappings producing different number of results HOT 10
- Enable passing parameters via env vars or a config file HOT 4
- gtfs:zone is an object property in the ontology but data property in the mappings HOT 2
- Service-Calendar and Shape-shapePoints are joins without conditions HOT 1
- Queries with booleans in the triple patters do not produce result is ontop
- gtfs:distanceTraveled datatype
- Include PostgreSQL and Oracle schema SQL files when generating
- Change YARRRML translator to yatter HOT 4
- Include in the ontology all properties and classes
- Remove xsd:duration datatype from the mappings HOT 7
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gtfs-bench.