Introduction to data.table 2 days ago
Data analysis using data.table | Data | Introduction | 1. Basics | a) What is data.table? | Note that: | b) General form - in what way is a data.table enhanced? | The way to read it (out loud) is: | c) Subset rows in i | -- Get all the flights with "JFK" as the origin airport in the month of June. | -- Get the first two rows from flights. | -- Sort flights first by column origin in ascending order, and then by dest in descending order: | order() is internally optimised | d) Select column(s) in j | -- Select arr_delay column, but return it as a vector. | -- Select arr_delay column, but return as a data.table instead. | Tip: | -- Select both arr_delay and dep_delay columns. | -- Select both arr_delay and dep_delay columns and rename them to delay_arr and delay_dep. | e) Compute or do in j | -- How many trips have had total delay < 0? | What's happening here? | f) Subset in i and do in j | -- Calculate the average arrival and departure delay for all flights with "JFK" as the origin airport in the month of June. | -- How many trips have been made in 2014 from "JFK" airport in the month of June? | g) Handle non-existing elements in i | -- What happens when querying for non-existing elements? | Special symbol .N: | h) Great! But how can I refer to columns by names in j (like in a data.frame)? | -- Select both arr_delay and dep_delay columns the data.frame way. | -- Select columns named in a variable using the .. prefix | -- Select columns named in a variable using with = FALSE | 2. Aggregations | a) Grouping using by | -- How can we get the number of trips corresponding to each origin airport? | -- How can we calculate the number of trips for each origin airport for carrier code "AA"? | -- How can we get the total number of trips for each origin, dest pair for carrier code "AA"? | -- How can we get the average arrival and departure delay for each orig,dest pair for each month for carrier code "AA"? 
| b) Sorted by: keyby | -- So how can we directly order by all the grouping variables? | c) Chaining | -- How can we order ans using the columns origin in ascending order, and dest in descending order? | d) Expressions in by | -- Can by accept expressions as well or does it just take columns? | e) Multiple columns in j - .SD | -- Do we have to compute mean() for each column individually? | Special symbol .SD: | -- How can we specify just the columns we would like to compute the mean() on? | .SDcols | f) Subset .SD for each group: | -- How can we return the first two rows for each month? | g) Why keep j so flexible? | -- How can we concatenate columns a and b for each group in ID? | -- What if we would like to have all the values of column a and b concatenated, but returned as a list column? | Summary | Using i: | Using j: | Using by: | And remember the tip:
Bugs Fixed in Spatstat 9 days ago
Summary of Recent Updates to the Spatstat Family 9 days ago
Reference semantics 9 days ago
Data | Introduction | 1. Reference semantics | a) Background | shallow vs deep copy | b) The := operator | 2. Add/update/delete columns by reference | a) Add columns by reference | -- How can we add columns speed and total delay of each flight to flights data.table? | Note that | b) Update some rows of columns by reference - sub-assign by reference | -- Replace those rows where hour == 24 with the value 0 | Exercise: | c) Delete column by reference | -- Remove delay column | d) := along with grouping using by | -- How can we add a new column which contains for each orig,dest pair the maximum speed? | Note on zero-length RHS and by | e) Multiple columns and := | -- How can we add two more columns computing max() of dep_delay and arr_delay for each month, using .SD? | -- How can we update multiple existing columns in place using .SD? | 3. := and copy() | a) := for its side effect | b) The copy() function | c) Selecting columns: $ / [[...]] vs [, col] | Summary | The := operator
Importing data.table 15 days ago
Why to import data.table | Importing data.table is easy | DESCRIPTION file | NAMESPACE file | Usage | Testing | Testing using testthat | Dealing with "undefined global functions or variables" | Care needed when providing and using options | Troubleshooting | License | Optionally import data.table: Suggests | data.table in Imports but nothing imported | Further information on dependencies | Importing data.table C routines | How to convert your Depends dependency on data.table to Imports | Step 0. Ensure your package is passing R CMD check initially | Step 1. Update the DESCRIPTION file to put data.table in Imports, not Depends | Step 2.1: Run R CMD check | Step 2.2: Modify the NAMESPACE file | Blanket import | Step 3: Update Your R code files outside the package's R/ directory | Benefits of using Imports
Add charts to a workbook 16 days ago
Add plot to workbook | Add {ggplot2} plot to workbook | Add plot via | Adding {encharter} plots | Add and fill a chartsheet | Add {mschart} plots
Intro to r5r: Rapid Realistic Routing with R5 in R 21 days ago
1. Introduction | 2. Installation | 3. Usage | 3.1 Data requirements: | 4. Demonstration on sample data | Data | 4.1 Building routable transport network with build_network() | 4.2 Accessibility analysis | 4.3 Routing analysis | Fast many to many travel time matrix | Expanded travel time matrix with minute-by-minute estimates | Detailed itineraries | Visualize results | Cleaning up after usage
Accessibility 22 days ago
1. Introduction | 2. Build routable transport network with build_network() | Increase Java memory and load libraries | 3. Accessibility: quick and easy approach | 4. Accessibility: flexible approach | 5. Map Accessibility | 5.1 Choropleth maps | 5.2 Spatial interpolation | Cleaning up after usage | References
Trip planning with detailed_itineraries() 22 days ago
1. Introduction | 2. Build routable transport network with build_network() | 3. Detailed info by trip segment for multiple trip alternatives | 3.1 Visualize results | 4. A few options: | 4.1 Combining origins and destinations | 4.2 Keep geometry data in the output | 5. Hack for frequency-based GTFS feeds | Cleaning up after usage
Introduction to gtfsio 22 days ago
Installation | Basic usage | Reading feeds | Writing feeds | Checking GTFS objects
collapse Documentation and Resources 22 days ago
Built-In Structured Documentation | DeepWiki | JSS Article | Cheatsheet | Vignettes | Blog | Presentations and Slides
Developing odbc 29 days ago
Posit Professional Drivers | Drivers | Databases | PostgreSQL | MySQL | SQL Server test setup | SQLite | Oracle | On MacOS | On Linux | Snowflake | Amazon Redshift | RODBC
Additional vignettes 2 months ago
Vignettes for individual functions | General vignettes | Publication | Presentation | Statistical backend of | Suggestions
The curl package: a modern R interface to libcurl 2 months ago
Request interfaces | Getting in memory | Downloading to disk | Streaming data | Non blocking connections | Async requests | Exception handling | Error automatically | Check manually | Customizing requests | Setting handle options | ENUM (long) options | Disabling HTTP/2 | Performing the request | Reading cookies | On reusing handles | Posting forms | Using pipes
Visualizations with statistical details: The 'ggstatsplot' approach 2 months ago
Summary | Statement of Need | Benefits | Future Scope | Licensing and Availability | Acknowledgements | References
showtext: Using Fonts More Easily in R Graphs 3 months ago
Introduction | A Quick Example | The Usage | Loading Fonts | Working with R Markdown | CJK Fonts | How showtext Works | Compatibility with RStudio
An Introduction to Polars from R 3 months ago
What is Polars? | Documentation and tutorials | Series and DataFrames | Methods and pipelines | Subset | Aggregate and modify | Reshape | Join | Lazy execution | Data import | Execute R functions within a Polars query | Data types
Using custom functions or other R packages 3 months ago
Writing functions using polars expressions | Using purrr | Conclusion
A Future for R: A Comprehensive Overview 3 months ago
A Future for R: Best Practices for Package Developers 3 months ago
A Future for R: Common Issues with Solutions 3 months ago
A Future for R: Future Topologies 3 months ago
A Future for R: How the Future Framework is Validated 3 months ago
A Future for R: Non-Exportable Objects 3 months ago
A Future for R: Text and Message Output 3 months ago
Big IPUMS Data 3 months ago
Setup | Option 1: Trade money for convenience | Option 2: Reduce extract size | Remove unused data | Select cases | Use a sampled subset of the data | Option 3: Process the data in pieces | Reading chunked data | Chunked tabulation | Chunked regression | Reading yielded data | Yielded tabulation | Yielded GLM regression | Option 4: Use a database | Importing data into the database | Connecting to a database with dbplyr | Learning more
Installation details 3 months ago
How to install | From R-multiverse (recommended) | From GitHub | Details of installation | Pre-built Rust library binaries | Rust build time options | Features | Profile | Minimum Supported Rust Version (MSRV) | Builds for WASM/Emscripten
Depending on a development version 3 months ago
The Remotes field | GitHub | Other sources | CRAN submission
Reading, Writing and Converting Simple Features 3 months ago
Reading and writing through GDAL | Using st_read | Using st_write | Guessing a driver for output | Dataset and layer reading or creation options | Reading and writing directly to and from spatial databases | Conversion to other formats: WKT, WKB, sp | Conversion to and from well-known text | Conversion to and from well-known binary | Conversion to and from sp
Manipulating Simple Feature Geometries 3 months ago
Type transformations | For single geometries | For collections of geometry (sfc) and simple feature collections (sf) | Affine transformations | Coordinate reference systems conversion and transformation | Getting and setting coordinate reference systems of sf objects | Coordinate reference system transformations | Geometrical operations | Unary operations | Binary operations: distance and relate | Binary logical operations: | Operations returning a geometry
Manipulating Simple Features 3 months ago
Subsetting feature sets | Aggregating or summarizing feature sets | Joining two feature sets based on attributes | Joining two feature sets based on geometries
Plotting Simple Features 3 months ago
Plot methods for sf and sfc objects | Geometry only: sfc | Geometry with attributes: sf | Color key place and size | Class intervals | How does sf project geographic coordinates? | Graticules | Plotting sf objects with other packages | grid: st_as_grob | ggplot2 | mapview | tmap
The magick package: Advanced Image-Processing in R 3 months ago
Installing magick | Build from source | Image IO | Read and write | Converting formats | Preview | Transformations | Cut and edit | Filters and effects | Kernel convolution | Text annotation | Combining with pipes | Image Vectors | Layers | Combining | Pages | Animation | Drawing and Graphics | Graphics device | Drawing device | Animated Graphics | Raster Images | Base R rasters | The grid package | OCR text extraction
A Common Database Interface (DBI) 4 months ago
Version | Introduction | DBI Classes and Methods | Class DBIObject | Class DBIDriver | Class DBIConnection | Class DBIResult | Data Type Mappings | Utilities | Open Issues and Limitations | Resources
A Common Interface to Relational Databases from R and S -- A Proposal 4 months ago
Computing with Distributed Data | A Common Interface | Interface Classes | Class dbManager | Class dbConnection | Class dbResult | Class dbResultSet | Data Type Mappings | Open Issues | Limitations | Other Approaches | Open Database Connectivity (ODBC) | Java Database Connectivity (JDBC) | CORBA and a 3-tier Architecture | Resources | Acknowledgements | The S Version 4 Definitions
Advanced DBI Usage 4 months ago
Who this tutorial is for | How to run more complex queries using DBI | How to read part of a table from a database | How to use parameters (safely) in SQL queries | Quoting | Parameterized queries | SQL data manipulation - UPDATE, DELETE and friends | SQL transactions with DBI | Conclusion
Simple Features for R 4 months ago
What is a feature? | Dimensions | Simple feature geometry types | Coordinate reference system | How simple features in R are organized | sf: objects with simple features | sfc: simple feature geometry list-column | Mixed geometry types | sfg: simple feature geometry | Well-known text, well-known binary, precision | WKT and WKB | Precision | Reading and writing | Driver-specific options | Create, read, update and delete | Connection to spatial databases | Coordinate reference systems and transformations | Conversion, including to and from sp | Geometrical operations | Non-simple and non-valid geometries | Units | How attributes relate to geometries
Implementing a Work Queue using RPostgres 4 months ago
LISTEN / NOTIFY | SKIP LOCKED | Implementing our worker
Introduction to the IPUMS API for R Users 4 months ago
API availability | Set up your API key | Define an extract request | Extract request objects | Submit an extract request | Wait for an extract request to complete | Download an extract | Get info on past extracts | Share an extract definition | Revise a previous extract request | Putting it all together
IPUMS Data and R 4 months ago
Obtaining IPUMS data | Obtaining data via an IPUMS project website | Downloading from microdata projects | Downloading from aggregate data projects | Obtaining data via the IPUMS API | Extract support | Metadata support | Workflow | Reading IPUMS data | Microdata files | NHGIS files | Spatial boundary files | Ancillary files | Exploring file metadata | Labelled values
Microdata API Requests 4 months ago
Supported microdata collections | Basic IPUMS microdata concepts | IPUMS microdata metadata | Defining an IPUMS microdata extract request | Detailed variable specifications | Syntax | Case selections | Attached characteristics | Data quality flags | Monetary value adjustment | Time use variables | Data structure | Data file format | Next steps
Add styling to a workbook 4 months ago
Styling showcase | Colors, text rotation and number formats | the quick way: using high-level functions | the long way: using bare-metal functions | Working with number formats | numfmts | numfmts2 | Modifying the column widths | wb_set_col_widths | Adding borders | add borders | styled table | the long way: creating everything from the bone | Use workbook colors and modify them | Copy cell styles | Style strings | Create custom table styles
openxlsx2 basic manual 4 months ago
First steps | Handling workbooks | Importing as workbook | Exporting data | Exporting data frames or vectors | Exporting a wbWorkbook | dims/ wb_dims() | A note on speed and memory usage
openxlsx2 read to data frame 4 months ago
Importing data | Basic import | col_names - first row as column name | detect_dates - convert cells to R dates | show_formula - show formulas instead of results | dims - read specific dimension | cols - read selected columns | rows - read selected rows | convert - convert input to guessed type | skip_empty_rows - remove empty rows | skip_empty_cols - remove empty columns | row_names - keep rownames from input | types - convert column to specific type | start_row - where to begin | na - define missing values
Fast Read and Fast Write 5 months ago
1. fread() | 1.1 Using command line tools directly | 1.1.1 Reading directly from a text string | 1.1.2 Reading from URLs | 1.1.3 Automatic decompression of compressed files | 1.2 Automatic separator and skip detection | 1.3 High-Quality Automatic Column Type Detection | 1.4 Early Error Detection at End-of-File | 1.5 integer64 Support | 1.6 Drop or Select Columns by Name or Position | 1.7 Automatic Quote Escape Detection (Including No-Escape) | 2. fwrite() | 2.1 Intelligent and Minimalist Quoting (quote="auto") | 2.2 Fine-Grained Date/Time Serialization (dateTimeAs argument) | 2.3 Handling of bit64::integer64 | 2.4 Column Order and Subset Control | 3. A Note on Performance
Datasets Provided for the Spatstat Package 5 months ago
Joins in data.table 5 months ago
1. Defining example data | 2. data.table joining syntax | 3. Equi joins | 3.1. Right join | 3.1.1. Joining by a list argument | 3.1.2. Alternatives to define the on argument | 3.1.3. Operations after joining | Managing shared column Names with the j argument | Summarizing with on in data.table | 3.1.4. Joining based on several columns | 3.2. Inner join | 3.3. Anti-join | 3.4. Semi join | 3.5. Left join | 3.5.1. Joining after chain operations | 3.6. Many to many join | 3.6.1. Selecting one match | 3.6.2. Cross join | 3.7. Full join | 4. Non-equi join | 4.1 Output column names in non-equi joins | 5. Rolling join | 6. Taking advantage of joining speed | 6.1. Subsets as joins | 6.2. Updating by reference | Reference
formatR 5 months ago
1. Installation | 2. Reformat R code | 3. The Graphical User Interface | 4. Evaluate the code and mask output in comments | 5. Showcase | Substitute = with <- | Discard blank lines | Reindent code (2 spaces instead of 4) | Start function arguments on a new line | The pipe operators %>% and |> | Move left braces { to new lines | Do not wrap comments | Discard comments | 6. Further notes | Inline comments after an incomplete expression or ; | Inappropriate blank lines | ? with comments | 7. How does tidy_source() actually work? | 8. Global options
A Future for R: Available Future Backends 5 months ago
A Future for R: Controlling Default Future Strategy 5 months ago
A Future for R: Future API Backend Specification 5 months ago
Spherical geometry in sf using s2geometry 5 months ago
Introduction | Projected and geographic coordinates | Fundamental differences | Polygons on $S^2$ divide the sphere in two parts | Semi-open polygon boundaries | Bounding cap, bounding rectangle | Switching between S2 and GEOS | Measures | Area | Length | Distances | Predicates | Transformations | Buffers | st_buffer or st_is_within_distance? | References
Using Skimr 5 months ago
Introduction | The skim() function | Skimming data frames | Skimming vectors | Skimming matrices | Skimming without modification | Reshaping the results from skim() | Rendering the results of skim() | Customizing print options | Modifying skim() | Extending skimr | Solutions to common rendering problems
mapsf 5 months ago
Introduction | Main Features | Symbology | Map Layout | Themes | Export | Examples of thematic maps | Base map | Proportional Symbols | Choropleth Map | Typology Map | Proportional Symbols using Choropleth Coloration | Proportional Symbols using Typology Coloration | Label Map | Links Map | Datasets
Keys and fast binary search based subset 5 months ago
Data | Introduction | 1. Keys | a) What is a key? | Keys and their properties | b) Set, get and use keys on a data.table | -- How can we set the column origin as key in the data.table flights? | set* and :=: | -- Use the key column origin to subset all rows where the origin airport matches "JFK" | -- How can we get the column(s) a data.table is keyed by? | c) Keys and multiple columns | -- How can I set keys on both origin and dest columns? | -- Subset all rows using key columns where first key column origin matches "JFK" and second key column dest matches "MIA" | How does the subset work here? | -- Subset all rows where just the first key column origin matches "JFK" | -- Subset all rows where just the second key column dest matches "MIA" | What's happening here? | 2. Combining keys with j and by | a) Select in j | -- Return arr_delay column as a data.table corresponding to origin = "LGA" and dest = "TPA". | b) Chaining | -- On the result obtained above, use chaining to order the column in decreasing order. | c) Compute or do in j | -- Find the maximum arrival delay corresponding to origin = "LGA" and dest = "TPA". | d) sub-assign by reference using := in j | e) Aggregation using by | -- Get the maximum departure delay for each month corresponding to origin = "JFK". Order the result by month | 3. Additional arguments - mult and nomatch | a) The mult argument | -- Subset only the first matching row from all rows where origin matches "JFK" and dest matches "MIA" | -- Subset only the last matching row of all the rows where origin matches "LGA", "JFK", "EWR" and dest matches "XNA" | b) The nomatch argument | -- From the previous example, Subset all rows only if there's a match | 4. binary search vs vector scans | a) Performance of binary search approach | b) Why does keying a data.table result in blazing fast subsets? | Vector scan approach | Binary search approach | Summary
Programming on data.table 5 months ago
Introduction | Problem description | Example | Approaches to the problem | Avoid lazy evaluation | Use of parse / eval | Computing on the language | Use third party packages | Substituting variables and names | Substitute functions | Substitute variables and character values | Substituting lists of arbitrary length | Substitution of a complex query | Common mistakes | Use env argument from inside another function | Retired interfaces | get | mget | eval
Secondary indices and auto indexing 5 months ago
Data | Introduction | 1. Secondary indices | a) What are secondary indices? | Keyed vs. Indexed Subsetting | b) Set and get secondary indices | -- How can we set the column origin as a secondary index in the data.table flights? | -- How can we get all the secondary indices set so far in flights? | c) Why do we need secondary indices? | -- Reordering a data.table can be expensive and not always ideal | setkey() requires: | -- There can be only one key at the most | -- Secondary indices can be reused | -- The new on argument allows for cleaner syntax and automatic creation and reuse of secondary indices | on argument | 2. Fast subsetting using on argument and secondary indices | a) Fast subsets in i | -- Subset all rows where the origin airport matches "JFK" using on | -- How can I subset based on origin and dest columns? | b) Select in j | -- Return arr_delay column alone as a data.table corresponding to origin = "LGA" and dest = "TPA" | c) Chaining | -- On the result obtained above, use chaining to order the column in decreasing order. | d) Compute or do in j | -- Find the maximum arrival delay corresponding to origin = "LGA" and dest = "TPA". | e) sub-assign by reference using := in j | f) Aggregation using by | -- Get the maximum departure delay for each month corresponding to origin = "JFK". Order the result by month | g) The mult argument | -- Subset only the first matching row where dest matches "BOS" and "DAY" | -- Subset only the last matching row where origin matches "LGA", "JFK", "EWR" and dest matches "XNA" | h) The nomatch argument | -- From the previous example, subset all rows only if there's a match | 3. Auto indexing
Benchmarking data.table 6 months ago
fread: clear caches | subset: threshold for index optimization on compound queries | subset: index aware benchmarking | by reference operations | try to benchmark atomic processes | avoid class coercion | avoid microbenchmark(..., times=100) | multithreaded processing | inside a loop prefer set instead of := | inside a loop prefer setDT instead of data.table()
Efficient reshaping using data.tables 6 months ago
Data | Introduction | 1. Default functionality | a) melting data.tables (wide to long) | - Convert DT to long form where each dob is a separate observation. | - Name the variable and value columns to child and dob respectively | b) dcasting data.tables (long to wide) | - How can we get back to the original data table DT from DT.m1? | - Starting from DT.m1, how can we get the number of children in each family? | 2. Limitations in previous melt/dcast approaches | Issues | 3. Enhanced (new) functionality | a) Enhanced melt | - melt multiple columns simultaneously | - Using patterns() | - Using measure() to specify measure.vars via separator or pattern | b) Enhanced dcast | - Casting multiple value.vars simultaneously | Multiple functions to fun.aggregate:
Frequently Asked Questions about data.table 6 months ago
Beginner FAQs | Why do DT[ , 5] and DT[2, 5] return a 1-column data.table rather than vectors like data.frame? | Why does DT[,"region"] return a 1-column data.table rather than a vector? | Why does DT[, region] return a vector for the "region" column? I'd like a 1-column data.table. | Why does DT[ , x, y, z] not work? I wanted the 3 columns x,y and z. | I assigned a variable mycol="x" but then DT[, mycol] returns an error. How do I get it to look up the column name contained in the mycol variable? | What are the benefits of being able to use column names as if they are variables inside DT[...]? | OK, I'm starting to see what data.table is about, but why didn't you just enhance data.frame in R? Why does it have to be a new package? | Why are the defaults the way they are? Why does it work the way it does? | Isn't this already done by with() and subset() in base? | Why does X[Y] return all the columns from Y too? Shouldn't it return a subset of X? | What is the difference between X[Y] and merge(X, Y)? | Anything else about X[Y, sum(foo*bar)]? | That's nice. How did you manage to change it given that users depended on the old behaviour? | General Syntax | How can I avoid writing a really long j expression? You've said that I should use the column names, but I've got a lot of columns. | Why is the default for mult now "all"? | I'm using c() in j and getting strange results. | I have built up a complex table with many columns. I want to use it as a template for a new table; i.e., create a new table with no rows, but with the column names and types copied from my table. Can I do that easily? | Is a null data.table the same as DT[0]? | Why has the DT() alias been removed? | But my code uses j = DT(...) and it works. The previous FAQ says that DT() has been removed. | What are the scoping rules for j expressions? | Can I trace the j expression as it runs through the groups? | Inside each group, why are the group variables length-1? 
| Only the first 10 rows are printed, how do I print more? | With an X[Y] join, what if X contains a column called "Y"? | X[Z[Y]] is failing because X contains a column "Y". I'd like it to use the table Y in calling scope. | Can you explain further why data.table is inspired by A[B] syntax in base? | Can base be changed to do this then, rather than a new package? | I've heard that data.table syntax is analogous to SQL. | What are the smaller syntax differences between data.frame and data.table | I'm using j for its side effect only, but I'm still getting data returned. How do I stop that? | Why does [.data.table now have a drop argument from v1.5? | Rolling joins are cool and very fast! Was that hard to program? | Why does DT[i, col := value] return the whole of DT? I expected either no visible value (consistent with <-), or a message or return value containing how many rows were updated. It isn't obvious that the data has indeed been updated by reference. | OK, thanks. What was so difficult about the result of DT[i, col := value] being returned invisibly? | Why do I have to type DT sometimes twice after using := to print the result to console? | I've noticed that base::cbind.data.frame (and base::rbind.data.frame) appear to be changed by data.table. How is this possible? Why? | I've read about method dispatch (e.g. merge may or may not dispatch to merge.data.table) but how does R know how to dispatch? Are dots significant or special? How on earth does R know which function to dispatch and when? | Why do T and F behave differently from TRUE and FALSE in some data.table queries? | Questions relating to compute time | I have 20 columns and a large number of rows. Why is an expression of one column so quick? | I don't have a key on a large table, but grouping is still really quick. Why is that? | Why is grouping by columns in the key faster than an ad hoc by? | What are primary and secondary indexes in data.table? 
| Error messages | "Could not find function DT" | "unused argument(s) (MySum = sum(v))" | "translateCharUTF8 must be called on a CHARSXP" | cbind(DT, DF) returns a strange format, e.g. Integer,5 | "cannot change value of locked binding for .SD" | "cannot change value of locked binding for .N" | Warning messages | "The following object(s) are masked from package:base: cbind, rbind" | "Coerced numeric RHS to integer to match the column's type" | Reading data.table from RDS or RData file | General questions about the package | v1.3 appears to be missing from the CRAN archive? | Is data.table compatible with S-plus? | Is it available for Linux, Mac and Windows? | I think it's great. What can I do? | I think it's not great. How do I warn others about my experience? | I have a question. I know the r-help posting guide tells me to contact the maintainer (not r-help), but is there a larger group of people I can ask? | Where are the datatable-help archives? | I'd prefer not to post on the Issues page, can I mail just one or two people privately? | I have created a package that uses data.table. How do I ensure my package is data.table-aware so that inheritance from data.frame works?
Using .SD for Data Analysis 6 months ago
What is .SD? | Loading and Previewing Lahman Data | .SD on Ungrouped Data | Column Subsetting: .SDcols | Column Type Conversion | Controlling a Model's Right-Hand Side | Conditional Joins | Grouped .SD operations | Group Subsetting | Group Optima | Grouped Regression
collapse and data.table 6 months ago
Overview of Both Packages | Interoperating and some Do's and Don'ts | Further collapse features supporting data.table's | Additional Benchmarks | References
collapse and dplyr 6 months ago
1. Fast Aggregations | 1.1 Simple Aggregations | Excursus: What is Happening Behind the Scenes? | 1.2 More Speed using collapse Verbs | 1.3 Multi-Function Aggregations | 1.4 Weighted Aggregations | 2. Fast Transformations | 2.1 Fast Transform and Compute Variables | 2.2 Replacing and Sweeping out Statistics | 2.3 More Control using the TRA Function | 2.4 Faster Centering, Averaging and Standardizing | 2.5 Lags / Leads, Differences and Growth Rates | 3. Benchmarks | 3.1 Data | 3.1 Selecting, Subsetting, Ordering and Grouping | 3.1 Aggregation | 3.2 Transformation | References
collapse and plm 6 months ago
Part 1: Fast Transformation of Panel Data | 1.1 Between and Within Transformations | 1.2 Higher-Dimensional Between and Within Transformations | 1.3 Scaling and Centering | 1.4 Panel Lags / Leads, Differences and Growth Rates | 1.5 Panel Data to Array Conversions | Benchmarks | Part 2: Fast Exploration of Panel Data | 2.1 Variation Check for Panel Data | 2.2 Summary Statistics for Panel Data | 2.3 Exploring Panel Data in Matrix / Array Form | 2.4 Panel- Auto-, Partial-Auto and Cross-Correlation Functions | 2.5 Testing for Individual Specific and Time-Effects | Part 3: Programming Panel Data Estimators | References
collapse and sf 6 months ago
Summarising sf Data Frames | Selecting Columns and Subsetting | Aggregation and Grouping | Indexing | Unique Values, Ordering, Splitting, Binding | Transformations | Conversion to and from sf | Support for units | Conclusion
collapse for tidyverse Users 6 months ago
Namespace and Global Options | Using the Fast Statistical Functions | Writing Efficient Code | Using Internal Grouping | Conclusion
collapse's Handling of R Objects 6 months ago
Overview | General Principles | Specific Functions and Classes | Object Conversions | Selecting Columns by Data Type | Parsing of Time-IDs | xts/zoo Time Series | Support for sf and units | Support for data.table | Class-Agnostic Grouped and Indexed Data Frames | Conclusion
Developing with collapse 6 months ago
Introduction | Point 1: Be Minimalistic in Computations | Point 2: Think About Memory and Optimize | Point 3: Internally Favor Primitive R Objects and Functions | Some Notes on Global Options | Conclusion
Introduction to collapse6 months ago
Why collapse? | 1. Data and Summary Tools | 1.1 wlddev - World Bank Development Data | 1.2 GGDC10S - GGDC 10-Sector Database | 2. Fast Data Manipulation | 2.1 Selecting and Replacing Columns | 2.2 Subsetting | 2.3 Reordering Rows and Columns | 2.4 Transforming and Computing New Columns | 2.5 Adding and Binding Columns | 2.6 Renaming Columns | 2.7 Using Shortcuts | 2.8 Missing Values / Rows | 2.9 Unique Values / Rows | 2.10 Recoding and Replacing Values | 3. Quick Data Object Conversions | 4. Advanced Statistical Programming | 4.1 Fast (Grouped, Weighted) Statistical Functions | 4.2 Factors, Grouping Objects and Grouped Data Frames | 4.3 Grouped and Weighted Computations | 4.4 Transformations Using the TRA Argument | 5. Advanced Data Aggregation | 6. Data Transformations | 6.1 Row and Column Arithmetic | 6.1 Row and Column Data Apply | 6.2 Split-Apply-Combine Computing | 6.3 Fast (Grouped) Replacing and Sweeping-out Statistics | 6.4 Fast Standardizing | 6.5 Fast Centering and Averaging | 6.6 HD Centering and Linear Prediction | 7. Time Series and Panel Series | 7.1 Panel Series to Array Conversions | 7.2 Panel Series ACF, PACF and CCF | 7.3 Fast Lags and Leads | 7.4 Fast Differences and Growth Rates | 8. List Processing and a Panel-VAR Example | 8.1 List Search and Identification | 8.2 List Subsetting | 8.3 Recursive Apply and Unlisting in 2D | Going Further | References
Translations7 months ago
Data types | Verbs | Functions within verbs | Parentheses | Comparison operators | Basic arithmetics | Math functions | Logical operators | Branching and conversion | String manipulation | Date manipulation | Aggregation | Shifting | Ranking | Special cases | Contributing | Known incompatibilities | Output order stability | sum() | Empty vectors in aggregate functions | min() and max() for logical input | n_distinct() and multiple arguments | is.na() and NaN values | Row names | Other differences
Isochrones7 months ago
1. Introduction | 2. Build routable transport network with build_network() | Increase Java memory and load libraries | 3. Calculating and visualizing isochrones | 3.1 Polygon-based isochrones | 3.2 Line-based isochrones | Cleaning up after usage
Add conditional formatting to a workbook7 months ago
Rule applies to each cell in range | Highlight row dependent on first cell in row | Highlight column dependent on first cell in column | Highlight cell dependent on | Highlight entire range cols X rows dependent only on cell A1 | Highlight cells in column 1 based on value in column 2 | Highlight duplicates using default style | Cells containing text | Cells not containing text | Cells begins with text | Cells ends with text | Colorscale colors cells based on cell value | Databars | Between | Top N | Bottom N | Logical Operators
Upgrade from openxlsx7 months ago
Basic read and write functions | Read xlsx or xlsm files | Write xlsx files | Basic workbook functions | Loading a workbook | Styles | Conditional formatting | Data validation | Saving | Why openxlsx2? | Invitation to contribute
Definition of a gtsummary Object7 months ago
Introduction | Structure of a {gtsummary} object | table_body | table_styling | Constructing a {gtsummary} object | Printing a {gtsummary} object | The .$cards object
Third-party integrations8 months ago
Check your post8 months ago
Best practice | URL validity
httr2 9 months ago
Create a request | Perform a request and fetch the response | Control the request process
Memory protection: controlling automatic materialization9 months ago
Introduction | Eager and lazy computation | Example | Comparison | Prudence | Concept | Enforcing DuckDB operation | From stingy to lavish | Thrift | File ingestion and custom limits | Conclusion
Overview of Vignettes9 months ago
Function Overview | Description of Parameters | Formatting, Printing and Plotting | Dimension Reduction and Clustering | Plotting Functions
Accounting for monetary costs9 months ago
1. Introduction | 1.1 Details | 2. Reprex: the public transport system of Porto Alegre | 3. Setting up the fare structure | 3.1 Global Properties | max_discounted_transfers | transfer_time_allowance | fare_cap | 3.2 Configure fares by transport mode | 3.3 Configure fares by transfers | 3.4 Routes configuration | 4. Calculating travel time and accessibility accounting for monetary costs | 4.1 Travel time with monetary cost | 4.2 Calculating accessibility with monetary cost | Cleaning up after usage
Trade-offs between travel time and monetary cost9 months ago
1. Introduction | 2. What the pareto_frontier means. | 3. Demonstration of pareto_frontier(). | 3.1 Build routable transport network with build_network() | 3.2 Set up the fare structure | 3.3 Calculating a pareto_frontier(). | Cleaning up after usage | References
Travel time matrices9 months ago
1. Introduction | 2. Build routable transport network with build_network() | 3. The travel_time_matrix() function | 4. The expanded_travel_time_matrix() function | 5. The arrival_travel_time_matrix() function | Cleaning up after usage | References
Using the time_window parameter9 months ago
1. Introduction | The problem | The solution | 2. How the time_window works and how to interpret the results. | 3. Demonstration of time_window. | 3.1 Build routable transport network with build_network() | 3.2 Accessibility with time_window. | 3.3 Travel time matrix with time_window. | 3.4 Expanded travel time matrix with time_window. | 3.5 Detailed itineraries with time_window. | Cleaning up after usage | References
Data quality diagnosis10 months ago
Preface | Supported data structures | Data: nycflights13 | Data diagnosis | General diagnosis of all variables with diagnose() | Diagnosis of numeric variables with diagnose_numeric() | Diagnosis of categorical variables with diagnose_category() | Diagnosing outliers with diagnose_outlier() | Visualization of outliers using plot_outlier() | Visualization for missing values | visualize pareto chart using plot_na_pareto() | visualize combination chart using plot_na_hclust() | visualize combination chart using plot_na_intersect() | Automated report | Create a diagnostic report using diagnose_web_report() | Contents of dynamic web report | Some arguments for dynamic web report | Screenshot of dynamic report | Create a diagnostic report using diagnose_paged_report() | Contents of static paged report | Some arguments for static paged report | Screenshot of static report | Diagnosing tables in DBMS | Preparing table data | Diagnose data quality of variables in the DBMS | Diagnose data quality of categorical variables in the DBMS | Diagnose data quality of numerical variables in the DBMS | Diagnose outlier of numerical variables in the DBMS | Plot outlier information of numerical data diagnosis in the DBMS | Reporting the information of data diagnosis for table of the DBMS
Data Transformation10 months ago
Preface | datasets | Imputation of missing values | imputes the missing value with imputate_na() | Collaboration with dplyr | Impute outliers | imputes the outliers with imputate_outlier() | Standardization and Resolving Skewness | Introduction to the use of transform() | Standardization with transform() | Resolving Skewness data with transform() | Binning | Binning of individual variables using binning() | Optimal Binning with binning_by() | Automated report | Create a dynamic report using transformation_web_report() | Contents of dynamic web report | Some arguments for dynamic web report | Screenshot of dynamic report | Create a static report using transformation_paged_report() | Contents of static paged report | Some arguments for static paged report | Screenshot of static report
Exploratory Data Analysis10 months ago
Preface | Supported data structures | datasets | Univariate data EDA | Calculating descriptive statistics using describe() | Test of normality on numeric variables using normality() | Visualization of normality of numerical variables using plot_normality() | EDA of bivariate data | Calculation of correlation coefficient using correlate() | Visualization of the correlation matrix using plot.correlate() | EDA based on target variable | Definition of target variable | EDA when target variable is categorical variable | Cases where predictors are numeric variable | Cases where predictors are categorical variable | EDA when target variable is numerical variable | Automated report | Create a dynamic report using eda_web_report() | Contents of dynamic web report | Some arguments for dynamic web report | Screenshot of dynamic report | Create an EDA report using eda_paged_report() | Contents of static paged report | Some arguments for static paged report | Screenshot of static report | Exploratory data analysis for tables in DBMS | Preparing table data | Calculating descriptive statistics of numerical column of table in the DBMS | Test of normality on numeric columns in the DBMS | Normalization visualization of numerical column in the DBMS | Compute the correlation coefficient between two columns of the table in DBMS | Visualize correlation plot of numerical columns in the DBMS | Reporting the information of EDA for table of the DBMS
FAQ - Frequently Asked Questions10 months ago
1. Why do some trips from/to the same ID have travel times larger than zero? | 2. Is it possible to run r5r with custom modifications to street network data? | 3. Why are the output results of time_travel_matrix() and detailed_itineraries() different? | 4. What does the ERROR "Geographic extent of street layer exceeds limit" mean, and what to do about it? | 5. Is it possible to use custom car speed data with r5r? | 6. Why do I get identical results by public transport and walking?
Introduction to tidytransit10 months ago
Introduction | Installation & Dependencies | The General Transit Feed Specification | Read a GTFS Feed | Feed Validation Results | Finding More GTFS Feeds | Feed registries | Using MobilityData
Service Patterns and Calendar Schedules10 months ago
Overview | Prepare data | Analyse Data | Exploration Plot | Names for service patterns | Visualise services | Plot calendar for each service pattern | Plot number of trips per day as calendar
Generate a Departure Timetable10 months ago
Read GTFS data | trip_origin and trip_headsign | Create A Departure Time Table | Trips departing from stop | add route info (route_short_name) | Extract a single day | Simple plot
Transit (GTFS) Service & Headway Mapping with R10 months ago
Introduction | Setup | Outline | 1) Import Transit Data (GTFS) | 2) Identify Weekday Schedules of Service | 3) Calculate Headways | 4) Map Headways By Route | 5) Map Departures by Stop and Route
Using custom OSM car speeds and LTS10 months ago
1. Introduction | 2. Changing car speeds | 2.1 Changing car speeds by OSM edge | 2.1.1 Setting different congestion levels by road hierarchy | 2.1.2 Applying the same speed factor to all roads | Extra tip: | 2.2 Changing car speeds with a spatial polygon | 3. Changing cycling LTS values | 3.1 Changing LTS by OSM edge | 3.2. Changing LTS with a spatial polygon | Cleaning up after usage
Missing or unavailable (NA) objects in Spatstat11 months ago
Extending skimr11 months ago
Introduction | Skimming objects that are not coercible to data frames | Defining sfl's for a package | Adding new methods | Conclusion
Optimize polars performance11 months ago
Lazy vs eager execution | Order of operations | How does polars help? | Use the streaming engine | Use polars functions
Introduction to annotater11 months ago
Package functions | A note on the tidyverse and other metapackages | Usage in RStudio | Annotate package calls in active file | Annotate package repository sources in active file | Annotate titles and repository sources in active file | Annotate each package's function calls | Annotate each package's datasets | Expand metapackages
Black & White Figures for Print Journals11 months ago
Barplots in grey-scaled colors | Lineplots in b/w with different linetypes
Customize Plot Appearance11 months ago
Tweaking plot appearance | Using the Color Brewer palettes | Plot with flipped coordinates | Adding plot margins | Theme options | Pre-defined themes | Pre-defined scales | Set up own themes based on existing themes | Further customization options | Plot legend
Item Analysis of a Scale or an Index11 months ago
Performing an item analysis of a scale or index | Index score with one component | Index score with more than one component | Adding further statistics
Plotting Estimates (Fixed Effects) of Regression Models11 months ago
Fitting a logistic regression model | Plotting estimates of generalized linear models | Sorting estimates | Estimates on the untransformed scale | Showing value labels | Labelling the plot | Pick or remove specific terms from plot | Standardized estimates | Bayesian models (fitted with Stan) | Tweaking plot appearance | References
Plotting Likert Scales11 months ago
Plotting Marginal Effects of Regression Models11 months ago
Marginal effects | Marginal effects for different groups | Marginal effects at specific values or levels | Polynomial terms and splines | Different constant values for factors | Interaction terms
Aggregate Data API Requests1 year ago
Basic IPUMS aggregate data concepts | Metadata for aggregate data projects | Summary metadata | Detailed metadata | Defining an IPUMS aggregate data extract request | Basic extract definitions | Dataset specifications | Time series table specifications | Shapefile specifications | Invalid specifications | More complicated extract definitions | Data layout and file format | Next steps
Reading IPUMS Data1 year ago
IPUMS extract structure | Reading IPUMS microdata extracts | Hierarchical extracts | Reading IPUMS aggregate data extracts | Variable metadata | Handling multiple files | Reading spatial data | Harmonized vs. non-harmonized data
Getting Started with lehdr1 year ago
Introduction | Installation | Usage | Additional Examples | Adding County level signifiers | Aggregating at the County level | Aggregating Origin-Destination | Aggregating at Block Group, Tract, or State level
Introduction to DBI1 year ago
Who this tutorial is for | How to connect to a database using DBI | Secure password storage | How to retrieve column names for a table | Read a table into a data frame | Read only selected rows and columns into a data frame | How to end a DBMS session | Conclusion | Further Reading
Using tags1 year ago
Interoperability with DuckDB and dbplyr1 year ago
Introduction | From duckplyr to dbplyr | Call arbitrary functions in duckplyr | Conclusion
Extending dataMaid1 year ago
Introduction | Three steps of data documentation | Function templates | Writing a summaryFunction | Writing a visualFunction | Writing a checkFunction | A worked example | summaryFunction examples: countZeros() and meanSummary() | countZeros --- a simple summaryFunction | meanSummary --- an S3 generic summary function | visualFunction examples: mosaicVisual() and prettierHist() | mosaicVisual --- a new visualFunction for categorical data | prettierHist() --- a customized ggplot2 histogram | checkFunction examples: isID() and identifyColons() | isID --- a new checkFunction without problem values | identifyColons --- a new checkFunction with problem values | Calling the new summarize/visualize/check functions from makeDataReport() | A dataMaid report with user-defined functions: Documenting artData
Large data1 year ago
Introduction | To duckplyr | From files | From DuckDB | Materialization | To files | Memory usage | The big picture
Fallback to dplyr1 year ago
Introduction | DuckDB mode | Relation objects | Help from dplyr | Enforce DuckDB operation | Configure fallbacks | Conclusion
Selective use of duckplyr1 year ago
Introduction | External data with explicit qualification | Restoring dplyr methods | Own data | In other packages
Telemetry1 year ago
Implementer's interface1 year ago
Filtering GTFS feeds1 year ago
Filtering by agency_id, route_id, service_id, shape_id, stop_id, trip_id or route_type: | Filtering by day of the week or time of the day: | Filtering using a spatial extent
Introduction to gtfstools1 year ago
GTFS feeds | Basic usage | Read feeds | Analyse feeds | Manipulate feeds | Write feeds
Validating GTFS feeds1 year ago
Add formulas to a workbook1 year ago
Simple formulas | Array formulas | Array formulas creating multiple fields | cells metadata (cm) formulas | dataTable formulas (this example was originally provided by @zykezero for openxlsx) | dataTable formula differences
Implementing a new backend2 years ago
Getting started | Testing | Driver | Connection | Results | SQL methods | Metadata methods | Full DBI compliance
Implementing tidyselect interfaces2 years ago
Before we start | Selections as dots or as named arguments | Do you need tidyselect? | The selection evaluators | Defusing and resuming evaluation of R code | Resuming defused R code with tidyselect rules | Simple selections with dots | Simple selections with named arguments | Renaming selections | Creating selection helpers | Handling duplicate variables
Technical description of tidyselect2 years ago
Sets of variables | Bare names | The : operator | Boolean operators | Dots and c() | Renaming variables | Name combination and propagation | Set combination with named variables | Predicate functions | Selection helpers | Supported data types | Evaluation | Data-expressions and env-expressions | Arithmetic operators | Selecting versus renaming | All renaming inputs must be named | Renaming to an existing variable name | Duplicate columns in data frames | Acknowledgements
Proxies and Certificates on Windows Networks2 years ago
Multiple SSL Backends | Using a Proxy Server | Discovering Your Proxy Server
Welcome to the Tidyverse2 years ago
Summary | Tidyverse package | Components | Design principles | Acknowledgments | References
The tidy tools manifesto2 years ago
Reuse existing data structures | Compose simple functions with the pipe | Embrace functional programming | Design for humans
Query openrouteservice from R2 years ago
Get started | Disclaimer | Installation | Setting up API key | Directions | Isochrones | Matrix | Geocoding | POIs | Elevation | Optimization
Formatting with xlsx 2 years ago
Formatting with writeData and writeDataTable | Use of pre-defined table styles | Date Formatting | DateTime Formatting | Conditional Formatting | Numeric Formatting
Introduction2 years ago
Basic Examples | write.xlsx | write list of data.frames to xlsx-file | write.xlsx also accepts styling parameters | The simplest way is to set default options and set column class | Workbook styles | define a style for column headers | When writing a list, the stylings will apply to all list elements | write.xlsx returns the workbook object for further editing | Workbook creation walk-through | create workbook and set default border Colour and style | Add Sheets | write data to sheet 1 | write data to sheet 2 | add write group means | add write group variances | add style mean & variance table headers | save workbook | Gallery | Further Examples | Stock Price | Image Compression using PCA
Guide to Function Objects in Spatstat2 years ago
Value Labels in IPUMS data2 years ago
IPUMS variable metadata | Value labels | labelled vs. factor | Cautions regarding labelled variables | Prepping data with value labels | Convert labelled values to other data types | Create missing values based on value labels | Syntax for value label functions | Relabel values | Relabeling caveats | Remove unused value labels | Add new labels | Other resources
Customizing styler2 years ago
How styler works | Implementation details | Showcasing the development of a styling rule | Cache invalidation
The effect of strict = FALSE2 years ago
Distribute custom style guides2 years ago
Reference implementations | Design patterns
stplanr: A Package for Transport Planning2 years ago
Note | Introduction | Package structure and functionality | Core functions and classes | Accessing and processing transport data | Creating geographic desire lines | Allocating flows to the transport network | Modelling travel catchment areas | Modelling and visualisation | Modelling mode choice | Models of travel behaviour | Visualisation | Future directions of travel | References
censusapi2 years ago
Handling shapefiles in the spatstat package2 years ago
Merging route networks2 years ago
Target network preprocessing
Installing and Configuring Drivers2 years ago
Installation | Windows | MacOS | Linux - Debian / Ubuntu | Driver configuration | Data source configuration | Debugging driver and data source configurations | Setting ODBCSYSINI
DBI specification2 years ago
Miscellaneous 2 years ago
What is this EPSG code all about? | Why should we use OGC:CRS84 instead of EPSG:4326? | How does sf deal with secondary geometry columns? | Does st_simplify preserve topology? | Why do my dplyr verbs not work for sf objects?
Get started2 years ago
Entry-points | Passing arguments to the style guide | Invasiveness | scope: What to style? | How strict do you want styler to be? | Ignoring certain lines | Caching | Dry mode | More configuration options | Roxygen code example styling | Custom math token spacing | Custom indention | Custom style guides
Introduce dlookr2 years ago
Preface | Supported data structures | List of supported tasks of data analytics | Diagnose Data | Overall Diagnose Data | Visualize Missing Values | Reporting | EDA | Univariate EDA | Bivariate EDA | Normality Test | Relationship between target variable and predictors | Transform Data | Find Variables | Imputation | Binning | Diagnose Binned Variable | Transformation | Miscellaneous | Statistics | Programming
Using DBI with Arrow2 years ago
Who this tutorial is for | Rationale | New classes and generics | Prepare | Read all rows from a table | Run queries | Prepared queries | Manual flow | Writing data | Appending data | Conclusion
Benchmarks2 years ago
Setup | Writing | Reading
History of DBI3 years ago
Dependency resolution for R package development3 years ago
Package remotes | GitHub | Other sources | CRAN submission
gtfsrouter3 years ago
1 Background: GTFS and other R packages | 2. Routing | 2.1 GTFS Timetables | 2.2. Routing by mode of transport | 2.3. Routing for earliest arrivals or earliest departures | 3. Convenience Functions: go_home() and go_to_work()
Transfer Tables3 years ago
Basic Usage | Extending transfer tables | Modifying transfer tables | Transfer distances | Transfer distances and travel times | A more realistic example
Travel Times3 years ago
Maximum Traveltimes | Fastest Routes versus Minimal-Transfer Routes | The Traveltimes Algorithm
Origin-destination data with stplanr3 years ago
Introduction: what is OD data? | The importance of OD data | An example OD dataset | Origin-destination pairs (long form) | Origin destination matrices | Inter and intra-zonal flows | Oneway lines | Desire lines | Non-matching IDs | A larger example: commuter trips in London | Plotting origin-destination data | Summaries by origin and destination | Further reading | Summary | References
Introducing stplanr3 years ago
Introduction | Installing stplanr | OD data to desire lines and routes | Converting OD data to desire lines with R | Motivations | Further resources | Contributing | References
Route networks with stplanr3 years ago
Introduction | Creating route networks from overlapping routes | Identifying route network groups | Routing on route networks | Adding new nodes | Other approaches
tabyls: a tidy, fully-featured approach to counting things3 years ago
Motivation: why tabyl? | How it works | Examples | One-way tabyl | Two-way tabyl | Three-way tabyl | Other features of tabyls | The adorn_* functions | The adorn functions are: | BYOt (Bring Your Own tabyl) | Questions? Comments?
Overview of janitor functions3 years ago
Major functions | Cleaning | Clean data.frame names with clean_names() | Do those data.frames actually contain the same columns? | Check with compare_df_cols() | Exploring | tabyl() - a better version of table() | Explore records with duplicated values for specific combinations of variables with get_dupes() | Explore relationships between columns with get_one_to_one() | Minor functions | Manipulate vectors of names with make_clean_names() | Validate that a column has a single_value() per group | remove_empty() rows and columns | remove_constant() columns | Directionally-consistent rounding behavior with round_half_up() | Round decimals to precise fractions of a given denominator with round_to_fraction() | Fix dates stored as serial numbers with excel_numeric_to_date() | Convert a mix of date and datetime formats to date | Elevate column names stored in a data.frame row | Find the header row buried within a messy data.frame | Count factor levels in groups of high, medium, and low with top_levels()
Alignment detection3 years ago
Overview | Examples | Details | Function calls | Comments | Assignment
Caching3 years ago
Remove rules3 years ago
Theory | Practice | Some other rules and their transformers
Summary of Bayesian Models as HTML Table3 years ago
Bayesian models summaries as HTML table | Multivariate response models | Show two Credible Interval-column | Mixing multivariate and univariate response models
Plotting Interaction Effects of Regression Models3 years ago
Two-Way-Interactions | Three-Way-Interactions | References
Getting Started4 years ago
Example plots | Basic use | Controlling layout | Stacking and packing plots | Annotating the composition | Want more?
RSQLite4 years ago
Creating a new database | Loading data | Queries | Batched queries | Multiple parameterised queries | Statements
Preparing Data for Interpolation4 years ago
Validating Data | Format Issues | External Spatial Data | sp Data | Data From tidycensus and tigris | Tabular Data | Coordinate Systems | Variable Conflicts | Subsetting
Areal Interpolation in R4 years ago
Motivation | Getting Started | Installation | Functions | Data | Preparing Data | Areal Weighted Interpolation | Getting Help | Suggesting Features or Changes
Robust Estimation of Standard Errors, Confidence Intervals and p-values4 years ago
Classical Regression Models | Robust Covariance Matrix Estimation from Model Parameters | Cluster-Robust Covariance Matrix Estimation (sandwich) | Cluster-Robust Covariance Matrix Estimation (clubSandwich) | Robust Covariance Matrix Estimation on Standardized Model Parameters | Mixed Models | Robust Covariance Matrix Estimation for Mixed Models | Robust Covariance Matrix Estimation on Standardized Mixed Model Parameters
stargazer4 years ago
Introduction | Why Should I Use stargazer? | Citing stargazer in Research Publications
Combining pages of JSON data with jsonlite4 years ago
A bidirectional mapping | Paging with jsonlite | Automatically combining many pages
Fetching JSON data from REST APIs4 years ago
Github | CitiBike NYC | Ergast | ProPublica | New York Times | Twitter
Shiny usage4 years ago
Use esquisse as a Shiny module | Module for saving a ggplot object | Module to render a plot and add export options | Input widgets | dragulaInput | dropInput | colorPicker | palettePicker
Analysing Replicated Point Patterns in Spatstat4 years ago
Getting Started with Spatstat4 years ago
Get started with esquisse5 years ago
Launch the addin | Import data into | Create a plot | Controls | Labels & titles | Plot options | Appearance | Filter | Code | Export | Addin options | Internationalization
Extending dataReporter5 years ago
Introduction | Three steps of data documentation | Function templates | Writing a summaryFunction | Writing a visualFunction | Writing a checkFunction | A worked example | summaryFunction examples: countZeros() and meanSummary() | countZeros --- a simple summaryFunction | meanSummary --- an S3 generic summary function | visualFunction examples: mosaicVisual() and prettierHist() | mosaicVisual --- a new visualFunction for categorical data | prettierHist() --- a customized ggplot2 histogram | checkFunction examples: isID() and identifyColons() | isID --- a new checkFunction without problem values | identifyColons --- a new checkFunction with problem values | Calling the new summarize/visualize/check functions from makeDataReport() | A dataReporter report with user-defined functions: Documenting artData
Transport routing with stplanr5 years ago
Introduction | OSRM
Summary of Regression Models as HTML Table5 years ago
A simple HTML table from regression results | Automatic labelling | Turn off automatic labelling | More than one model | Generalized linear models | Untransformed estimates on the linear scale | More complex models | Show or hide further columns | Adding columns | Removing columns | Removing and sorting columns | Collapsing columns | Defining own labels | Including reference level of categorical predictors | Style of p-values | Automatic matching for named vectors | Keep or remove coefficients from the table
Using flextable5 years ago
Using officedown6 years ago
An Introduction to the DT Package6 years ago
Customizing HTML tables6 years ago
Copying table output to office or word processors | Export table as HTML file to open in word processors | Drag and drop from browser or RStudio viewer pane | Customizing table output with the CSS parameter | Retrieving customizable styles | Pre-defined Table-Layouts
Parallel routing and performance with stplanr6 years ago
With old route_cyclestreets function | With new route function | With new route function in parallel | In parallel with quietness plan | Tests
Summary of Mixed Models as HTML Table6 years ago
Mixed models summaries as HTML table | Generalized linear mixed models | More complex models | References
How to Datapasta6 years ago
Typical Usage with Rstudio | Pasting a table as a formatted tibble definition with tribble_paste() | Pasting a list as a horizontal vector with vector_paste() | Pasting a list as a vertical vector with vector_paste_vertical() | Outputting data from your R environment | Output to R with dpasta() | Avoiding fiddly data formatting | Fiddle Selections until they're better | Toggle Quotes | Output to clipboard with dmdclip() | Usage without RStudio | Custom Behaviour for Your Unique Snowflake Setup | Configurable Options | Upping the row guard | Dealing with "," decimal marks
Datapasta in the cloud6 years ago
Fallback 1: Text selection | Fallback 2: Pop-up text editor | Configuration
Skimr defaults7 years ago
Introduction | The base skimmers | Default skimmers
Using Fonts7 years ago
Gathering QWI Data Over Several Years for Multiple States7 years ago
Introduction
Areal Weighted Interpolation7 years ago
Introduction to Areal Weighted Interpolation | Step 1: Intersection | Step 2: Areal Weights | Step 3: Estimate Population | Step 4: Summarize Data | Extensive and Intensive Interpolations | Extensive Interpolations | Calculating Weights for Extensive Interpolations | Weights Example 1: Non-Overlap Due to Data Quality | Weights Example 2: Non-Overlap Due to Differing Boundaries | Intensive Interpolations | Mixed Interpolations | Output Options | Other Features of aw_interpolate | Manual Workflow | Intersect Data | Calculate Total Area | Calculate Areal Weight | Calculate Estimated Population | Aggregate Estimated Population by Target ID
RJournal 6 111-122 (2014)8 years ago
stringdist C/C++ API8 years ago
(Unofficial) overview of gtable9 years ago
Constructing a gtable | Components of a gtable | Modifying a gtable | Examples to alter ggplot2 plots with gtable
Displaying tables as grid graphics9 years ago
Basic usage | Spacing | Aesthetic formatting | Text justification | Further gtable processing and integration | Borders and separators | Accessing existing grobs in the table | Faster tables: an alternative grid function
Getting started with JSON and jsonlite9 years ago
Simplification | Atomic Vectors | Data Frames | Matrices and Arrays
Regular polygons and ellipses in grid graphics11 years ago
Basic usage | Rotated and stretched polygons | Ellipses
Arranging multiple grobs on a page11 years ago
Basic usage | Title and/or annotations | Complex layouts | Nested layouts with arrangeGrob | Multiple pages output
A mapping between JSON data and R objects11 years ago